On 25 September 2012, representatives from Bavaria, Berlin, Bremen, Hamburg and Baden-Wuerttemberg as well as from PortalU and GDI-DE met at our institute, Fraunhofer FOKUS in Berlin, to discuss the metadata structure for OGPD. We also discussed how existing data offerings can be brought into OGPD.

Harvesting refers to the merging of metadata from different catalogs. For OGPD, the metadata of the workshop participants as well as of DESTATIS is harvested insofar as it meets the minimum criteria for Open Data: only those records, documents and applications are accepted that have a freely available electronic resource, a description and a well-defined license.
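The minimum criteria above can be sketched as a simple filter. This is only an illustration: the field names `resources`, `url`, `notes` and `license_id` are assumptions borrowed from common CKAN conventions, not the confirmed OGPD schema, and whether a resource is actually freely accessible cannot be decided from the metadata alone.

```python
# Hypothetical field names; the actual OGPD metadata schema may differ.
def meets_minimum_criteria(record: dict) -> bool:
    """Return True if the record has a resource URL, a description and a license."""
    resources = record.get("resources") or []
    has_resource = any((r.get("url") or "").strip() for r in resources)
    has_description = bool((record.get("notes") or "").strip())
    has_license = bool((record.get("license_id") or "").strip())
    return has_resource and has_description and has_license


def filter_harvestable(records):
    """Keep only the records that satisfy all three minimum criteria."""
    return [r for r in records if meets_minimum_criteria(r)]
```

A record without any resource URL, with an empty description or without a license identifier would be skipped by such a filter.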

To that end, the proposed metadata structure was discussed. It was adjusted in particular with regard to unique identifiers (to trace the origin of a record unambiguously and to detect duplicates), the handling of contact details, the detection of open licenses and the geographical coverage. In addition, the main categories for classifying datasets, documents and applications were discussed and consolidated into the following 14 main categories:

  • Economics and Labour
  • Transport and Traffic
  • Environment and Climate
  • Geography, Geology and Geodata
  • Health
  • Consumer Protection
  • Infrastructure, Construction and Housing
  • Education and Science
  • Public Administration, Budget and Tax
  • Law and Justice
  • Social
  • Culture, Leisure, Sport and Tourism
  • Population
  • Politics and Elections

These main categories form the basic classification and can be supplemented by more specific sub-categories, for example subject-specific ones. For harvesting, existing categorizations such as those in INSPIRE or EVAS are mapped to these 14 categories.
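Such a mapping could look roughly like the following sketch. The short category keys and the three INSPIRE theme assignments are made up for illustration and are not the official OGPD mapping; only the 14 category names come from the workshop.

```python
# The 14 main categories from the workshop; the short keys are hypothetical.
MAIN_CATEGORIES = {
    "economics_labour": "Economics and Labour",
    "transport_traffic": "Transport and Traffic",
    "environment_climate": "Environment and Climate",
    "geography_geology_geodata": "Geography, Geology and Geodata",
    "health": "Health",
    "consumer_protection": "Consumer Protection",
    "infrastructure_construction_housing": "Infrastructure, Construction and Housing",
    "education_science": "Education and Science",
    "administration_budget_tax": "Public Administration, Budget and Tax",
    "law_justice": "Law and Justice",
    "social": "Social",
    "culture_leisure_sport_tourism": "Culture, Leisure, Sport and Tourism",
    "population": "Population",
    "politics_elections": "Politics and Elections",
}

# Illustrative assignments only; not the official INSPIRE-to-OGPD mapping.
SOURCE_TO_MAIN = {
    "Transport networks": "transport_traffic",
    "Geology": "geography_geology_geodata",
    "Human health and safety": "health",
}


def map_categories(source_categories):
    """Map source labels to main-category keys, skipping unknown labels
    and duplicates while preserving the original order."""
    mapped = []
    for label in source_categories:
        key = SOURCE_TO_MAIN.get(label)
        if key is not None and key not in mapped:
            mapped.append(key)
    return mapped
```

Labels without a known mapping are dropped here; a real harvester might instead log them for manual review.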

Once the metadata structure had been clarified as the target format for data, documents and applications, various ways of providing datasets to OGPD were discussed. As a result, four different ways will be implemented and offered:

  • Passive provision via CSW, which is used, for example, for the spatial data catalog and PortalU
  • Passive provision via CKAN / JSON, which is used, for example, in Berlin, Hamburg and Bremen
  • Active provision via the CKAN API, which will be used, for example, by Bavaria
  • Manual entry via a form, which is used, for example, by the Ministry of Finance for the fiscal data
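To give an idea of what active provision via the CKAN API might involve, here is a minimal sketch that builds (but does not send) a request to the `package_create` action of CKAN's action API as found in current CKAN releases. The base URL, the API key and the dataset fields are placeholders, and the API version actually offered by OGPD may differ.

```python
import json
import urllib.request


def build_package_create_request(base_url, api_key, dataset):
    """Build a POST request for CKAN's package_create action.

    base_url, api_key and the dataset fields are placeholders here.
    """
    url = base_url.rstrip("/") + "/api/3/action/package_create"
    body = json.dumps(dataset).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


request = build_package_create_request(
    "https://ckan.example.org",   # hypothetical portal address
    "hypothetical-api-key",       # hypothetical credential
    {
        "name": "example-dataset",
        "title": "Example dataset",
        "notes": "Short description of the dataset.",
        "license_id": "cc-by",
    },
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

With passive provision, by contrast, the provider only exposes its catalog and the portal fetches the metadata itself.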

The main result of our harvesting workshop is certainly the revised metadata structure, which is now available on github.com.

Managing the metadata structure for OGPD on GitHub allows transparent, collaborative development with version control. Change requests can be made publicly, the history of the metadata structure is documented and the current status is always visible.

Florian Marienfeld and Thomas Scheel have just added an HTML representation of the JSON schema of the OGPD metadata structure, which makes the structure more readable and easier to understand.

We look forward to your comments and suggestions on the metadata structure and/or on harvesting, directly on GitHub or just as gladly here.


This work and content is licensed under a Creative Commons Attribution 3.0 Unported License.

About the author: Ina Schieferdecker
