Indexed ISO Metadata Fields

All of the following fields are indexed for every data type (collection and granule). Follow the link for each field for more information, such as the XPaths used for parsing, explanations of created fields, and the sub-fields of any listed field that is actually an object.



This field must be present.



It is a best practice for this to be the collection id as it is in OneStop for this granule. Either path is ingested, in this order:












This field must be present and have a value of “granule” (not case sensitive) in order for a record to be identified as such. Otherwise, it will be identified as a collection record.
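The rule above can be sketched as a small check. This is only an illustration: the function name `record_type` and its `value` parameter are hypothetical, and `None` stands in for a field that is absent from the record.

```python
from typing import Optional


def record_type(value: Optional[str]) -> str:
    """Classify a record from the text of the field described above.

    `value` is None when the field is absent (an assumption of this sketch).
    """
    # Only an explicit "granule" (in any letter case) marks a granule record.
    if value is not None and value.lower() == "granule":
        return "granule"
    # Anything else, including a missing field, falls back to collection.
    return "collection"
```

For example, `record_type("Granule")` yields `"granule"`, while a missing field or any other text yields `"collection"`.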








Keywords, General & Specific

The top level path of all keyword objects is:



For every keyword found at the top level path except accession values (see next sub-section), a keywords object is created with the following sub-fields. Sub-fields in the table below are relative to the above path. For values, which is listed twice, either path is accepted.

Sub-Field XPath
namespace ./gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString[1]
type ./gmd:type/gmd:MD_KeywordTypeCode/@codeListValue[1]
values ./gmd:keyword/gco:CharacterString
values ./gmd:keyword/gmx:Anchor

For all keyword values, leading and trailing whitespace characters are trimmed.
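As a rough illustration of the table above, the sketch below reads the three sub-fields with Python's standard `xml.etree.ElementTree`. It is not the indexer's actual code. The function name `parse_keywords` is hypothetical, and the sample fragment assumes `gmd:MD_Keywords` as the enclosing element, since the top level path is not reproduced here. ElementTree cannot select attributes in a path, so the `@codeListValue` step is done with `.get()`. Skipping empty values and listing CharacterString values before Anchor values are also assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespace URIs for the prefixes used in the table.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}

VALUE_PATHS = ("./gmd:keyword/gco:CharacterString", "./gmd:keyword/gmx:Anchor")


def parse_keywords(node):
    """Build a keywords object from one element found at the top level path."""
    title = node.find(
        "./gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString", NS
    )
    type_code = node.find("./gmd:type/gmd:MD_KeywordTypeCode", NS)
    values = []
    for path in VALUE_PATHS:  # either path is accepted for values
        for element in node.findall(path, NS):
            text = (element.text or "").strip()  # trim surrounding whitespace
            if text:
                values.append(text)
    return {
        "namespace": title.text if title is not None else None,
        "type": type_code.get("codeListValue") if type_code is not None else None,
        "values": values,
    }


SAMPLE = """
<gmd:MD_Keywords xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 xmlns:gmx="http://www.isotc211.org/2005/gmx">
  <gmd:keyword><gco:CharacterString>  Oceans  </gco:CharacterString></gmd:keyword>
  <gmd:keyword><gmx:Anchor>Atmosphere</gmx:Anchor></gmd:keyword>
  <gmd:type><gmd:MD_KeywordTypeCode codeListValue="theme"/></gmd:type>
  <gmd:thesaurusName><gmd:CI_Citation><gmd:title>
    <gco:CharacterString>Example Thesaurus</gco:CharacterString>
  </gmd:title></gmd:CI_Citation></gmd:thesaurusName>
</gmd:MD_Keywords>
"""

keywords = parse_keywords(ET.fromstring(SAMPLE.strip()))
```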


The accessionValues are extracted in the same manner as other keyword text. However, since they are not actually keywords, they are stored separately. These are determined by a namespace value equal to ‘NCEI ACCESSION NUMBER’ (not case-sensitive) and only the extracted values are stored.
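A minimal sketch of this separation, assuming each keyword object is a dictionary with `namespace` and `values` entries. That shape and the function name `split_accessions` are hypothetical and chosen only for illustration.

```python
ACCESSION_NAMESPACE = "ncei accession number"


def split_accessions(keyword_objects):
    """Separate accession values from ordinary keyword objects."""
    keywords = []
    accession_values = []
    for obj in keyword_objects:
        namespace = obj.get("namespace") or ""
        if namespace.lower() == ACCESSION_NAMESPACE:
            # Only the extracted values are kept, not the whole object.
            accession_values.extend(obj["values"])
        else:
            keywords.append(obj)
    return keywords, accession_values
```

The comparison is an exact match on the lower-cased namespace, which follows the "equal to" wording above rather than a substring test.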

GCMD Keywords

GCMD Keywords are a special set of hierarchical Earth science keywords that we support as “facet filters”. A keyword set is identified as GCMD when its namespace text contains either ‘GCMD’ or ‘Global Change Master Directory’ (not case-sensitive). The specific category is then determined by the namespace text as follows:

Category Namespace Text (not case-sensitive)
Science ‘earth science’ AND NOT ‘services’
ScienceServices ‘earth science services’
Locations ‘location’ OR ‘place’
Instruments ‘instrument’
Platforms ‘platform’
Projects ‘project’
DataCenters ‘data center’
HorizontalResolution ‘horizontal data resolution’
VerticalResolution ‘vertical data resolution’
TemporalResolution ‘temporal data resolution’
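The table can be read as a sequence of substring tests on the lower-cased namespace. The sketch below is one possible reading, not the indexer's implementation. The function name `gcmd_category` is hypothetical, and checking the rows in table order with the first match winning is an assumption, since the text does not say how overlapping matches are resolved.

```python
def gcmd_category(namespace):
    """Return the GCMD category for a namespace, or None if it is not GCMD."""
    text = namespace.lower()
    if "gcmd" not in text and "global change master directory" not in text:
        return None
    # (category, predicate) pairs in the same order as the table rows.
    rules = [
        ("Science", lambda t: "earth science" in t and "services" not in t),
        ("ScienceServices", lambda t: "earth science services" in t),
        ("Locations", lambda t: "location" in t or "place" in t),
        ("Instruments", lambda t: "instrument" in t),
        ("Platforms", lambda t: "platform" in t),
        ("Projects", lambda t: "project" in t),
        ("DataCenters", lambda t: "data center" in t),
        ("HorizontalResolution", lambda t: "horizontal data resolution" in t),
        ("VerticalResolution", lambda t: "vertical data resolution" in t),
        ("TemporalResolution", lambda t: "temporal data resolution" in t),
    ]
    for category, matches in rules:
        if matches(text):
            return category
    return None
```

A namespace that is recognized as GCMD but matches no row returns `None` here. How the real indexer treats that case is not stated above.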

These keywords are normalized before indexing: they are title-cased; all excess internal whitespace characters are removed; capitalization is applied around the delimiters ' ', '/', '.', '(', '-' and '_'; and an attempt is made to preserve any acronyms present in keywords of the format ‘Short Name > Long Name’. In addition, ‘Earth Science > ’ and ‘Earth Science Services > ’ are removed from the text of science and science services keywords, respectively.
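A simplified sketch of this normalization follows. It covers whitespace cleanup, capitalization after the listed delimiters, and prefix removal. The acronym-preserving step is deliberately left out because its exact rules are not described here, so this sketch would lower-case an acronym such as the short name in ‘Short Name > Long Name’. The function names and the `category` argument are hypothetical.

```python
import re

DELIMITERS = " /.(-_"
PREFIXES = {
    "Science": "Earth Science > ",
    "ScienceServices": "Earth Science Services > ",
}


def title_case(text):
    """Collapse whitespace and capitalize the letter after each delimiter."""
    text = re.sub(r"\s+", " ", text.strip())
    result = []
    capitalize_next = True  # the first character is capitalized too
    for char in text:
        result.append(char.upper() if capitalize_next else char.lower())
        capitalize_next = char in DELIMITERS
    return "".join(result)


def normalize_gcmd_keyword(text, category=None):
    """Title-case a keyword and drop the category prefix where one applies."""
    normalized = title_case(text)
    prefix = PREFIXES.get(category)
    if prefix and normalized.startswith(prefix):
        normalized = normalized[len(prefix):]
    return normalized
```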




The top level path of this object is:


Sub-fields in the table below are relative to the above path, except for description (full path given in this case). For fields listed twice, either path is accepted but the first path takes precedence.

If the value found for either beginDate or endDate represents a year earlier than -100,000,000, only the respective year field will be populated and the date field will be null.

Sub-Field XPath
beginDate ./gml:TimePeriod/gml:beginPosition[1]
beginDate ./gml:TimePeriod/gml:begin/gml:TimeInstant/gml:timePosition[1]
beginIndeterminate ./gml:TimePeriod/gml:beginPosition/@indeterminatePosition[1]
beginIndeterminate ./gml:TimePeriod/gml:begin/gml:TimeInstant/gml:timePosition/@indeterminatePosition[1]
beginYear This field is either parsed from beginDate or the text found in the path if it is a year value.
endDate ./gml:TimePeriod/gml:endPosition[1]
endDate ./gml:TimePeriod/gml:end/gml:TimeInstant/gml:timePosition[1]
endIndeterminate ./gml:TimePeriod/gml:endPosition/@indeterminatePosition[1]
endIndeterminate ./gml:TimePeriod/gml:end/gml:TimeInstant/gml:timePosition/@indeterminatePosition[1]
endYear This field is either parsed from endDate or the text found in the path if it is a year value.
instant ./gml:TimeInstant/gml:timePosition[1]
instantIndeterminate ./gml:TimeInstant/gml:timePosition/@indeterminatePosition[1]
description /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:description/gco:CharacterString[1]


The top level path of this object is:


From here, the vertices of the bounding box (which is stored as a GeoJSON Polygon) are determined by ./gmd:westBoundLongitude/gco:Decimal (plus the corresponding east longitude and north and south latitude elements). If west == east AND north == south, the geometry is interpreted as a GeoJSON Point. Otherwise, if west == east OR north == south, the geometry is interpreted as a GeoJSON LineString. In the future, a geographic element in the metadata that is NOT a bounding box could be used to more accurately depict the spatial bounding.
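The geometry selection described above can be sketched as follows. The function name `bbox_to_geojson` is hypothetical, and the choice of endpoints for the LineString and the vertex order of the Polygon ring are assumptions of this sketch. Coordinates follow the GeoJSON convention of `[longitude, latitude]`.

```python
def bbox_to_geojson(west, east, north, south):
    """Turn the four bounding values into a GeoJSON geometry dictionary."""
    if west == east and north == south:
        # Degenerate box: a single location.
        return {"type": "Point", "coordinates": [west, north]}
    if west == east or north == south:
        # Box collapsed along one axis: a line between its two corners.
        return {"type": "LineString", "coordinates": [[west, south], [east, north]]}
    # Full box: one closed exterior ring, listed counterclockwise.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}
```

Testing the Point case first matters, because a box with both pairs equal also satisfies the LineString condition.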


This field is calculated upon ingest of the spatialBounding and set to true if and only if the data is a bounding box spanning [-180, 180] longitude and [-90, 90] latitude.
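Expressed as code, the check might look like the sketch below. The function name `is_global` and its arguments are hypothetical, and exact equality on the four bounds is an assumption, since no tolerance is mentioned above.

```python
def is_global(west, east, north, south):
    """True only for a bounding box covering the full longitude/latitude range."""
    return west == -180 and east == 180 and south == -90 and north == 90
```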


The top level path of this object is:


Sub-fields in the table below are relative to the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
instrumentIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString[1]
instrumentIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gmx:Anchor[1]
instrumentType ./gmi:type/gco:CharacterString[1]
instrumentType ./gmi:type/gmx:Anchor[1]
instrumentDescription ./gmi:description/gco:CharacterString[1]
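The "first path takes precedence" rule used in this and the following tables can be sketched with a small helper built on Python's standard `xml.etree.ElementTree`. The helper name `first_text` is hypothetical, and the sample fragment assumes `gmi:MI_Instrument` as the enclosing element, since the top level path is not reproduced here. Treating an element with empty text as missing is also an assumption.

```python
import xml.etree.ElementTree as ET

# Standard ISO namespace URIs for the prefixes used in the table.
NS = {
    "gmi": "http://www.isotc211.org/2005/gmi",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}

IDENTIFIER_PATHS = (
    "./gmi:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString",
    "./gmi:identifier/gmd:MD_Identifier/gmd:code/gmx:Anchor",
)
TYPE_PATHS = ("./gmi:type/gco:CharacterString", "./gmi:type/gmx:Anchor")
DESCRIPTION_PATHS = ("./gmi:description/gco:CharacterString",)


def first_text(node, paths):
    """Return the text of the first path that matches, in listed order."""
    for path in paths:
        element = node.find(path, NS)  # find() returns the first match only
        if element is not None and element.text:
            return element.text
    return None


SAMPLE = """
<gmi:MI_Instrument xmlns:gmi="http://www.isotc211.org/2005/gmi"
                   xmlns:gmd="http://www.isotc211.org/2005/gmd"
                   xmlns:gco="http://www.isotc211.org/2005/gco"
                   xmlns:gmx="http://www.isotc211.org/2005/gmx">
  <gmi:identifier><gmd:MD_Identifier><gmd:code>
    <gmx:Anchor>EXAMPLE-ID</gmx:Anchor>
  </gmd:code></gmd:MD_Identifier></gmi:identifier>
  <gmi:type><gco:CharacterString>radiometer</gco:CharacterString></gmi:type>
</gmi:MI_Instrument>
"""

instrument = ET.fromstring(SAMPLE.strip())
instrument_identifier = first_text(instrument, IDENTIFIER_PATHS)
instrument_type = first_text(instrument, TYPE_PATHS)
instrument_description = first_text(instrument, DESCRIPTION_PATHS)
```

In the sample, the identifier is only present as a `gmx:Anchor`, so the second path supplies the value. The description is absent and comes back as `None`.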


The top level path of this object is:


Sub-fields in the table below are relative to the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
operationIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString[1]
operationIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gmx:Anchor[1]
operationType ./gmi:type/gmi:MI_OperationTypeCode/@codeListValue[1]
operationStatus ./gmi:status/gmd:MD_ProgressCode/@codeListValue[1]
operationDescription ./gmi:description/gco:CharacterString[1]


The top level path of this object is:


Sub-fields in the table below are relative to the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
platformIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString[1]
platformIdentifier ./gmi:identifier/gmd:MD_Identifier/gmd:code/gmx:Anchor[1]
platformDescription ./gmi:description/gco:CharacterString[1]
platformSponsor ./gmi:sponsor/gmd:CI_ResponsibleParty/gmd:organisationName//gco:CharacterString


The top level path of this object is:


Sub-fields in the table below are relative to the above path.

Sub-Field XPath
name ./gmd:name/gco:CharacterString[1]
version ./gmd:version/gco:CharacterString[1]

The top level path of this object is:


Sub-fields in the table below are relative to the above path.

Sub-Field XPath
linkName ./gmd:name/gco:CharacterString[1]
linkProtocol ./gmd:protocol/gco:CharacterString[1]
linkUrl ./gmd:linkage/gmd:URL[1]
linkDescription ./gmd:description/gco:CharacterString[1]
linkFunction ./gmd:function/gmd:CI_OnLineFunctionCode/@codeListValue[1]


This object is determined by the path:

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification//gmd:CI_ResponsibleParty/gmd:role/gmd:CI_RoleCode[@codeListValue='pointOfContact' or @codeListValue='distributor']

Sub-fields in the table below are relative to the gmd:CI_ResponsibleParty element of the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
individualName ./gmd:individualName/gco:CharacterString[1]
individualName ./gmd:individualName/gmx:Anchor[1]
organizationName ./gmd:organisationName/gco:CharacterString[1]
organizationName ./gmd:organisationName/gmx:Anchor[1]
positionName ./gmd:positionName/gco:CharacterString[1]
positionName ./gmd:positionName/gmx:Anchor[1]
role ./gmd:role/gmd:CI_RoleCode/@codeListValue[1]
email ./gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString[1]
phone ./gmd:contactInfo/gmd:CI_Contact/gmd:phone/gmd:CI_Telephone/gmd:voice/gco:CharacterString[1]


This object is determined by the path:

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification//gmd:CI_ResponsibleParty/gmd:role/gmd:CI_RoleCode[@codeListValue='resourceProvider' or @codeListValue='originator' or @codeListValue='principalInvestigator' or @codeListValue='author' or @codeListValue='collaborator' or @codeListValue='coAuthor']

Sub-fields in the table below are relative to the gmd:CI_ResponsibleParty element of the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
individualName ./gmd:individualName/gco:CharacterString[1]
individualName ./gmd:individualName/gmx:Anchor[1]
organizationName ./gmd:organisationName/gco:CharacterString[1]
organizationName ./gmd:organisationName/gmx:Anchor[1]
positionName ./gmd:positionName/gco:CharacterString[1]
positionName ./gmd:positionName/gmx:Anchor[1]
role ./gmd:role/gmd:CI_RoleCode/@codeListValue[1]
email ./gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString[1]
phone ./gmd:contactInfo/gmd:CI_Contact/gmd:phone/gmd:CI_Telephone/gmd:voice/gco:CharacterString[1]
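The two role filters given so far can be summarized in a short sketch. The group labels `"contact"` and `"creator"` and the function name `party_group` are hypothetical names chosen for illustration. Only the role codes themselves come from the paths above. The comparison is case-sensitive, as an XPath equality test would be.

```python
CONTACT_ROLES = {"pointOfContact", "distributor"}
CREATOR_ROLES = {
    "resourceProvider",
    "originator",
    "principalInvestigator",
    "author",
    "collaborator",
    "coAuthor",
}


def party_group(role_code):
    """Return which of the two groups a codeListValue selects, if any."""
    if role_code in CONTACT_ROLES:
        return "contact"
    if role_code in CREATOR_ROLES:
        return "creator"
    # Roles outside both filters are not selected by either path.
    return None
```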


This object is determined by the path:


Sub-fields in the table below are relative to the gmd:CI_ResponsibleParty element of the above path. For fields listed twice, either path is accepted but the first path takes precedence.

Sub-Field XPath
individualName ./gmd:individualName/gco:CharacterString[1]
individualName ./gmd:individualName/gmx:Anchor[1]
organizationName ./gmd:organisationName/gco:CharacterString[1]
organizationName ./gmd:organisationName/gmx:Anchor[1]
positionName ./gmd:positionName/gco:CharacterString[1]
positionName ./gmd:positionName/gmx:Anchor[1]
role ./gmd:role/gmd:CI_RoleCode/@codeListValue[1]
email ./gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString[1]
phone ./gmd:contactInfo/gmd:CI_Contact/gmd:phone/gmd:CI_Telephone/gmd:voice/gco:CharacterString[1]













Note: The text ‘cite’ is not case-sensitive.


The top level path of this object is:


Sub-fields in the table below are relative to the above path.

Sub-Field XPath
title ./gmd:title/gco:CharacterString[1]
date ./gmd:date/gmd:CI_Date/gmd:date/gco:Date[1]
links.linkName .//gmd:CI_OnlineResource/gmd:name/gco:CharacterString
links.linkProtocol .//gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString
links.linkUrl .//gmd:CI_OnlineResource/gmd:linkage/gmd:URL
links.linkDescription .//gmd:CI_OnlineResource/gmd:description/gco:CharacterString
links.linkFunction .//gmd:CI_OnlineResource/gmd:function/gmd:CI_OnLineFunctionCode/@codeListValue


The top level path of this object is:


Sub-fields in the table below are relative to the above path.

Sub-Field XPath
title ./gmd:title/gco:CharacterString[1]
date ./gmd:date/gmd:CI_Date/gmd:date/gco:Date[1]
links.linkName .//gmd:CI_OnlineResource/gmd:name/gco:CharacterString
links.linkProtocol .//gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString
links.linkUrl .//gmd:CI_OnlineResource/gmd:linkage/gmd:URL
links.linkDescription .//gmd:CI_OnlineResource/gmd:description/gco:CharacterString
links.linkFunction .//gmd:CI_OnlineResource/gmd:function/gmd:CI_OnLineFunctionCode/@codeListValue











DSMM Values

A record has DSMM values if the following XPath is present:

/gmi:MI_Metadata/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:report/gmd:DQ_ConceptualConsistency[gmd:nameOfMeasure/gco:CharacterString='Data Stewardship Maturity Assessment']

From here, every DSMM value is collected with the path:


where the measure is determined by:


and the score is given by:


Measures are:

The score text is mapped to a numerical value as defined in the table below:

CodeListValue Numerical Equivalent
notAvailable 0
adHoc 1
minimal 2
intermediate 3
advanced 4
optimal 5

The dsmmAverage field is the arithmetic mean of the numerical scores of the given measures.
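The mapping and the average can be sketched as below. The function name `dsmm_average` and the input shape (a dictionary from measure name to score text) are hypothetical. Returning 0.0 when no measures are given is an assumption, since that case is not covered above.

```python
DSMM_SCORES = {
    "notAvailable": 0,
    "adHoc": 1,
    "minimal": 2,
    "intermediate": 3,
    "advanced": 4,
    "optimal": 5,
}


def dsmm_average(score_text_by_measure):
    """Map each score text to its number and return the mean of the results."""
    numbers = [DSMM_SCORES[text] for text in score_text_by_measure.values()]
    if not numbers:
        return 0.0  # assumed fallback for a record with no measures
    return sum(numbers) / len(numbers)
```

For instance, two measures scored `adHoc` and `optimal` map to 1 and 5, giving an average of 3.0. The measure names used as keys are placeholders and do not affect the result.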






Deprecated: this field will be empty for newly created documents but still exists in the index for older records. The entire block of XML is ingested and stored as a Base64-encoded string. Multiple sections result in an array of these strings.


The top level path of this object is:


Sub-fields in the table below are relative to the above path.

Sub-Field XPath
title .//CI_Citation/gmd:title/gco:CharacterString
alternativeTitle .//CI_Citation/gmd:title/gco:CharacterString
description ./abstract/gco:CharacterString
links.linkName .//srv:SV_OperationMetadata//gmd:CI_OnlineResource/gmd:name/gco:CharacterString[1]
links.linkProtocol .//srv:SV_OperationMetadata//gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString[1]
links.linkUrl .//srv:SV_OperationMetadata//gmd:CI_OnlineResource/gmd:linkage/gmd:URL[1]
links.linkDescription .//srv:SV_OperationMetadata//gmd:CI_OnlineResource/gmd:description/gco:CharacterString[1]
links.linkFunction .//srv:SV_OperationMetadata//gmd:CI_OnlineResource/gmd:function/gmd:CI_OnLineFunctionCode/@codeListValue[1]

