Metadata and XML
Technologies and techniques used to support metadata creation, management, translation, and publication for data discovery and access.
Metadata Automation using XML Technologies
Creating and maintaining standardized metadata presents many obstacles. Producing, validating, and managing metadata records can be challenging and time consuming, and content sources, formats, storage structures, and standards take a wide variety of forms. Initial metadata creation is often especially slow because some metadata standards are complex and difficult to implement, and tools are limited.
Metadata, the documentation of data, can be represented in a number of standards and formats apart from the FGDC Content Standard for Digital Geospatial Metadata (CSDGM) and its profiles. Some standards predate the FGDC CSDGM, and others were created to meet the specific needs of particular communities. To support data management and data discovery systems, and to capture and convey information to users, many discipline- or community-specific metadata standards have been developed. Examples include Directory Interchange Format (DIF), Ecological Metadata Language (EML), Sensor Model Language (SensorML), Climate Science Modeling Language (CSML), and NetCDF Markup Language (NcML). Additionally, the International Organization for Standardization (ISO) provides a series of standards used to describe geographic information. This variety of available standards has created some interoperability and compatibility issues. Many conventional metadata creation and validation methods in use today do not readily address these issues.
The ability to take the necessary information and render it in a form and standard needed by the client/user/consumer is critical for the data’s discovery, access, use, and preservation. Extensible Markup Language (XML) techniques are being applied to automate metadata creation and translation and to provide a way to overcome numerous obstacles to producing and maintaining relevant metadata.
The process for using XML includes developing a representative document of what a source (e.g., a data model) contains, and mapping that source to a representative document of the desired output, or target. These representative documents are called schemas (.xsd). This mapping between the source schema and the target schema defines a transform (.xslt). The transform is then applied to the source XML to create the desired output. Programmatic metadata generation provides many other benefits, such as reduced effort, consistency, enhanced accuracy, and improved efficiency.
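The source-to-target mapping described above can be illustrated with a minimal Python sketch. It is a stand-in for an XSLT, not one of the working group's transforms: a small dictionary plays the role of the mapping, and the standard-library ElementTree module applies it to a source record. The source paths use CSDGM short names; the sample content and the target element names (record, title, abstract) are hypothetical and far simpler than real ISO 19115 XML.

```python
import xml.etree.ElementTree as ET

# Source record using CSDGM-style short names (content is illustrative only).
SOURCE_XML = """
<metadata>
  <idinfo>
    <citation><citeinfo><title>Example Bathymetry Survey</title></citeinfo></citation>
    <descript><abstract>Gridded depth soundings.</abstract></descript>
  </idinfo>
</metadata>
"""

# Hypothetical mapping: source path -> target element name.
CROSSWALK = {
    "idinfo/citation/citeinfo/title": "title",
    "idinfo/descript/abstract": "abstract",
}


def transform(source_xml: str, crosswalk: dict) -> ET.Element:
    """Build a target record by copying each mapped source value."""
    source = ET.fromstring(source_xml.strip())
    target = ET.Element("record")  # hypothetical target root element
    for source_path, target_name in crosswalk.items():
        node = source.find(source_path)
        # Skip source elements that are absent or empty.
        if node is not None and node.text:
            ET.SubElement(target, target_name).text = node.text.strip()
    return target


result = transform(SOURCE_XML, CROSSWALK)
print(ET.tostring(result, encoding="unicode"))
```

A real XSLT expresses the same mapping declaratively and is run by an XSLT processor, which Python's standard library does not include; the dictionary here only shows the shape of the source-to-target relationship.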
Downloads
NCDDC Initiatives
Fine-tuning the transforms takes collaborative input to ensure that the products of these transforms are accurate and useful. The NCDDC Metadata Team has been working with the FGDC Metadata Working Group; NOAA’s National Ocean Service, National Geophysical Data Center, and National Oceanographic Data Center; the U.S. Geological Survey; and various other organizations to establish a collaborative Metadata Transform Working Group. The focus of this group, for the time being, is to collaboratively build transforms (using XSLT 2.0) for translating metadata among such standards as FGDC, North American Profile, and ISO.
The goal of this collaborative effort is to produce metadata transforms among standards, libraries that support various conversions, and best practices for their application. The results will benefit not only NOAA but also the greater metadata community, resulting in a coherent and complete package of products to transition among metadata standards.
To date, the following items have been checked by the Metadata Transform Working Group and are considered “close to finalized.” Revised versions will be posted here as they become available. Because the development of these transforms and libraries for translation of content is a collaborative effort, any questions or comments can be sent to ncddc.metadata@noaa.gov to be passed along to the working group.
- FGDC CSDGM to ISO Transforms - A zip file containing the FGDC CSDGM to ISO Version 1.2 transform using the MD_Metadata root. Once unzipped, there will be four .xslt files: FGDC CSDGM to ISO 19115, FGDC CSDGM to ISO 19110 (FC), FGDC CSDGM to ISO 19111 (MD_CRS), and FGDC CSDGM to ISO 19111 (baseCRS).
- FGDC CSDGM to ISO Test Cases - A zip file that contains input and output files of multiple test cases for transforming FGDC CSDGM XML records to ISO 19115, ISO 19110 (Feature Catalog), and ISO 19111 (CRS) XML records. These records correspond to the CSDGM to ISO v1 XSLT.
- FGDC CSDGM to ISO Crosswalk - A crosswalk from FGDC CSDGM to ISO 19115. This crosswalk incorporates feature catalogs (ISO 19110) and coordinate reference systems (ISO 19111) while also providing comments for suggested best practices.
- FGDC BIO to ISO Transforms - A zip file containing the FGDC Biological Profile to ISO Version 1.0 transform using the MD_Metadata root. Once unzipped, there will be four .xslt files: FGDC Bio to ISO 19115, FGDC Bio to ISO 19110 (FC), FGDC Bio to ISO 19111 (MD_CRS), and FGDC Bio to ISO 19111 (baseCRS).
- FGDC BIO to ISO Test Cases - This is a zip file that contains input and output files of multiple test cases for transforming FGDC Biological Profile xml records to ISO 19115, ISO 19110 (Feature Catalog), and ISO 19111 (CRS) xml records.
- FGDC BIO to ISO Crosswalk - A crosswalk from FGDC Biological Profile to ISO 19115. This crosswalk also incorporates feature catalogs (ISO 19110) and coordinate reference systems (ISO 19111). The goal of this crosswalk is to document the transform developed by the working group between the FGDC-STD-001.1-1999 and ISO 19115:2003. Please pay special attention to comments that are italicized.
- FGDC-STD-001-1998 Schema - A zip file that contains an updated FGDC Content Standard for Digital Geospatial Metadata (CSDGM) FGDC-STD-001-1998 schema. Updates are documented within the schema. This schema consists of 11 .xsd files once unzipped. FGDC-STD-001-1998.xsd is the master file that includes the .xsd file associated with each section of the standard. The edits include corrected errors, edited domains, and extended annotation.
- FGDC-STD-001.1 1999 Schema - A zip file that contains an updated FGDC Biological Profile FGDC-STD-001.1-1999 schema. Updates are documented within the schema. This schema consists of 11 .xsd files once unzipped. FGDC-STD-001.1-1999.xsd is the master file that includes the .xsd file associated with each section of the standard. The edits include corrected errors, edited domains, and extended annotation.
- FGDC-STD-001.2 2001 Schema - A zip file that contains an updated FGDC Shoreline Profile FGDC-STD-001.2-2001 schema. Updates are documented within the schema. This schema consists of 11 .xsd files once unzipped. FGDC-STD-001.2-2001.xsd is the master file that includes the .xsd file associated with each section of the standard.
- FGDC-STD-012-2002 Schema - A zip file that contains an updated FGDC Remote Sensing Extensions FGDC-STD-012-2002 schema. Updates are documented within the schema. This schema consists of 14 .xsd files once unzipped. FGDC-STD-012-2002.xsd is the master file that includes the .xsd file associated with each section of the standard. The edits include corrected errors, edited domains, and extended annotation.
- NOAA Technical Memorandum (NODC-NCDDC-1 DRAFT) - Drafted NOAA Tech Memo "Automated Metadata Generation Using Extensible Markup Language (XML) Techniques."
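Each schema download listed above has a master .xsd that pulls in one .xsd file per section of the standard. The following minimal sketch shows one way to list those included files with Python's standard library. The inline master schema and its two section file names are hypothetical placeholders, not the contents of any actual download; the xs:include element and its schemaLocation attribute are standard XML Schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, shortened master schema; a real master file lists more sections.
MASTER_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="section1-example.xsd"/>
  <xs:include schemaLocation="section2-example.xsd"/>
</xs:schema>
"""


def included_files(master_xsd: str) -> list:
    """Return the schemaLocation of every top-level xs:include element."""
    root = ET.fromstring(master_xsd.strip())
    # ElementTree addresses namespaced elements as {namespace-uri}localname.
    return [inc.get("schemaLocation") for inc in root.findall(f"{{{XS}}}include")]


sections = included_files(MASTER_XSD)
print(sections)
```

For an unzipped download, the same function could be given the text of the master file read from disk; a validating parser resolves these includes automatically when the section files sit alongside the master file.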
Projects Currently Employing XML Technologies
- West Coast Observing System (WCOS) - The West Coast Observing System (WCOS) is an end-to-end data management system created in partnership with NOAA's Office of National Marine Sanctuaries. EML metadata records are translated to FGDC metadata via XSLTs.
- Cruise Information Management System (CIMS) - The Cruise Information Management System (CIMS) is the cornerstone of the Office of Ocean Exploration and Research data management process. Information from CIMS is translated to FGDC and MARC metadata via XSLTs.
- Regional Ecosystem Data Management (REDM) - Regional Ecosystem Data Management (REDM) provides a coordinated data management system and data discovery mechanism for atmospheric, oceanographic, and terrestrial physical sciences to facilitate sustained economic growth, scientifically sound environmental management, and public safety for the Nation and the international community. FGDC metadata records from MERMAid are translated to REDM records via XSLTs to provide semantic search capabilities.
- Florida Geospatial Assessment of Marine Ecosystems (GAME) - Florida Geospatial Assessment of Marine Ecosystems (GAME) records were converted to FGDC CSDGM metadata and ingested into MERMAid and REDM via XSLTs. This work involved mapping database content to FGDC metadata elements, developing the XSLTs for the mapping, and applying the XSLTs to an XML representation of the database to create the FGDC metadata.
- Louisiana Department of Natural Resources (LDNR) - The Louisiana Office of Coastal Protection and Restoration, Applied Coastal Engineering & Science (LACES) Division partnered with NCDDC to map databases of coastal data to create FGDC metadata for all LDNR Strategic Online Natural Resources Information System (SONRIS) coastal data via XSLTs.
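The database-driven projects above share one pattern: database content is mapped to FGDC metadata elements, and the mapping is then applied to produce records. The sketch below imitates that flow in Python with an in-memory SQLite table. The table, column names, and sample row are hypothetical, and the dictionary-driven mapping is a stand-in for the XSLTs the projects used; it also skips the intermediate XML export of the database and maps rows directly. The target paths use CSDGM short names.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: database column -> CSDGM element path.
COLUMN_TO_FGDC = {
    "dataset_name": "idinfo/citation/citeinfo/title",
    "summary": "idinfo/descript/abstract",
}


def ensure_path(root: ET.Element, path: str) -> ET.Element:
    """Walk a slash-separated path under root, creating missing elements."""
    node = root
    for name in path.split("/"):
        child = node.find(name)
        if child is None:
            child = ET.SubElement(node, name)
        node = child
    return node


def row_to_fgdc(row: sqlite3.Row) -> ET.Element:
    """Create one FGDC-style <metadata> record from one database row."""
    metadata = ET.Element("metadata")
    for column, path in COLUMN_TO_FGDC.items():
        ensure_path(metadata, path).text = str(row[column])
    return metadata


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute("CREATE TABLE datasets (dataset_name TEXT, summary TEXT)")
conn.execute(
    "INSERT INTO datasets VALUES (?, ?)",
    ("Example Seagrass Survey", "Hypothetical sample row."),
)
records = [row_to_fgdc(r) for r in conn.execute("SELECT * FROM datasets")]
conn.close()
print(ET.tostring(records[0], encoding="unicode"))
```

Because both mapped paths begin with idinfo, ensure_path reuses the existing idinfo element rather than creating a second one, so each row yields a single nested record.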