One of the main goals of the GovData.de prototype is to unite as many open data sets from Germany as possible in a single catalogue. The biggest part is imported automatically by so-called harvesters. In this article we give you an overview of which tools have been used and how useful they have proven.

Everything is ready! We are looking forward to a great 1st International Open Data Dialog in Berlin on Dec. 5-6, 2012 at Fraunhofer FOKUS.

Having built the first German Open Data portal and currently building the German Open Government Platform with a strong emphasis on open government data, we now invite all our friends, partners, colleagues and all interested people to discuss innovative business models based on open data. We would like to discuss how to promote the potential of new value-added services built on open data offers; how to design, develop and deploy infrastructure and tools for providing and processing open data, also in combination with closed commercial and private data; and how to foster transparency in society, business, politics and administration through the use of open data.

The Open Government Platform for Germany (OGPD) is an access portal for electronic resources in public administration: primarily data, but also documents and applications. It bundles locally maintained files in one convenient interface and provides a central access point for citizens in general and, in particular, for developers, data journalists, governments and businesses. It also gives users a feedback channel to the data providers within the authorities.

To fulfill this purpose, the platform includes two main components: a content management system (CMS) and a data catalog. The CMS manages editorial content such as information pages, links, news, and user comments and reviews, and it supports an integrated view of the data catalog. The catalog holds the metadata describing the data, documents and applications; these metadata in turn refer to distributed data offers (files or services available online).

This architectural pattern occurs frequently in similar portals. Differences arise mainly in the choice of software products for the components and in the way they interact with each other. The choice of Liferay as CMS and CKAN as data catalog is explained in the OGPD study. Here, we only want to explain how they fit together and how they can be used by the actors of the platform (for example, users or editors).

At its core is the Liferay CMS, which provides most of the functionality as so-called portlets in a web interface. Editorial content such as articles and blog posts is created right here. The contents of the data catalog are displayed using search fields and result lists. Data providers can register new datasets or update existing ones via a web form.

In addition to queries/edits via CMS, the data catalog can be accessed directly via a REST interface. With this, data providers can automate the release of their data in the OGPD.
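As a rough illustration of such automation, the sketch below builds (but does not send) an HTTP request that registers a dataset. It assumes the catalog exposes CKAN's action API with the `package_create` action; the exact path depends on the CKAN version in use, and the catalog URL and API key shown here are placeholders, not values of the OGPD.

```python
import json
import urllib.request

# Placeholder values (assumptions): use the real catalog URL and your own API key.
CATALOG_URL = "https://catalog.example.org"
API_KEY = "my-secret-api-key"


def build_create_request(dataset: dict) -> urllib.request.Request:
    """Build, but do not send, a POST request that registers a dataset."""
    return urllib.request.Request(
        url=f"{CATALOG_URL}/api/3/action/package_create",
        data=json.dumps(dataset).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": API_KEY},
        method="POST",
    )


request = build_create_request({
    "name": "example-dataset",   # URL-friendly identifier
    "title": "Example dataset",  # human-readable title
})

# Sending is a separate, explicit step, for example:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log what a release script would publish before it touches the catalog.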

For data providers who maintain their own metadata catalogs, the harvesting component comes into play. It makes it possible to “harvest” existing catalogs, that is, to import their content while filtering it and transforming it into the metadata structures of the OGPD. For OGPD, the catalogs of the spatial data infrastructure, PortalU, destatis, Berlin, Bremen and Hamburg are currently read via INSPIRE CSW or the CKAN API, respectively. In keeping with the open data criteria, only those datasets are considered that have an electronic resource, a description and a well-defined license.
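The filtering step can be sketched as follows. This is a minimal illustration, not the actual harvester code: it assumes records arrive as CKAN-style dictionaries with the keys `resources`, `notes` (description) and `license_id`, and it treats any non-empty license identifier as "well-defined", which is a simplification.

```python
def meets_open_data_criteria(record: dict) -> bool:
    """Return True if a record has a resource, a description and a license."""
    # At least one resource must point to an actual URL.
    has_resource = any(res.get("url") for res in record.get("resources", []))
    # The description must not be empty or whitespace only.
    has_description = bool((record.get("notes") or "").strip())
    # Simplification: any non-empty license identifier counts as well-defined.
    has_license = bool(record.get("license_id"))
    return has_resource and has_description and has_license


def filter_harvested(records: list) -> list:
    """Keep only the harvested records that satisfy the minimum criteria."""
    return [record for record in records if meets_open_data_criteria(record)]
```

A real harvester would additionally check the license against a list of accepted open licenses rather than accepting any identifier.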

For users, the web interface is the main entrance to OGPD. Here, editorial, information and community content can be searched. Through OGPD, users get direct access to the data offers available online. At the same time, they can comment on and review them.

Creative Commons License

This work and content is licensed under a Creative Commons Attribution 3.0 Unported License.

A key aspect of open data is easy access. Data journalists and application developers can tap into data faster and better if the data are discoverable in central portals. Centralized data management across administrative and domain boundaries is hardly feasible for various reasons (heterogeneous data, distributed competence, conflicts of interest, etc.), and it is not necessarily useful. Therefore, distributed data storage with a central metadata portal is generally a good idea. In a prominent place such as daten.berlin.de, information about and links to the data providers’ data are collected and presented; in Berlin these providers are, for example, the various Senate administrations or the city cleaning and transport agencies.

But what is recorded in the metadata of open datasets in addition to the name, description and author? This question arises both when capturing metadata and in the automatic exchange of metadata records, known as harvesting. Only if structure and meaning are sufficiently uniform or self-explanatory can a central portal be realized (in this case for Germany) that unites various data offers and the contents of existing data catalogs.

Consistent metadata is addressed in many domains, such as environmental data or bibliographic data, with different approaches and priorities (see OpenGov study, section on metadata). For Open Data, it has become best practice in Europe and America to use the metadata structures of CKAN (Comprehensive Knowledge Archive Network) by OKFN. In the open data field, CKAN is the de facto standard for data catalog software.

CKAN exchanges metadata in JSON format. The only required field is the name, which should be both readable for users and URL-friendly; all other fields are optional. The core fields are title, description, resources (i.e. data files or services), license and contact person. Further details can be stored in a JSON dictionary, i.e. as nested key-value pairs. This focus on the essentials, combined with great flexibility, is likely the reason for the spread of this metadata model.
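To make this concrete, here is a small sketch of such a record. The helper `to_ckan_name` is a hypothetical function that derives a URL-friendly name from a title. The field names `notes`, `license_id` and `author` follow common CKAN conventions, but all values are invented for illustration, and the way `extras` is represented can differ between CKAN API versions.

```python
import json
import re
import unicodedata


def to_ckan_name(title: str) -> str:
    """Derive a lowercase, URL-friendly name from a human-readable title."""
    # Strip accents and umlauts by decomposing and dropping non-ASCII marks.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of other characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


# Illustrative record; all values are made up.
dataset = {
    "name": to_ckan_name("Example: Street Trees 2012"),
    "title": "Example: Street Trees 2012",
    "notes": "Location and species of street trees (example description).",
    "license_id": "cc-by",
    "author": "Example Department",
    "resources": [
        {"url": "http://example.org/street-trees-2012.csv", "format": "CSV"},
    ],
    "extras": {"temporal_coverage": {"from": "2012-01-01", "to": "2012-12-31"}},
}

print(json.dumps(dataset, indent=2))
```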

Throughout the development of open data, especially in Berlin and Germany, a desire for more structure became apparent: many data providers and developers wanted precise instructions on what information should be stored in what form. In order to retain the minimal, flexible character of CKAN and JSON on the one hand, and to clearly define how the metadata for OGPD should look on the other, we are developing a JSON schema for Open Government Data (OGD).

The OGD metadata structure is maintained on github.com. It is intended not so much as a tool to validate metadata, but rather as a communication tool for interested parties such as public decision-makers, data providers, developers and other open data initiatives in the German-speaking area. Publishing it at an early beta stage and developing it transparently on github.com serves the same purpose.

The metadata structure supports the description of datasets (including data services) as well as documents and applications. It is set up as follows: the most important properties are stored at the top level. These include title, identifier, description, responsible party and terms of use. The list of resources, that is, the actual data, documents or applications, is also essential. The most important property of each resource is in turn its URL. In addition, a description and a format can be recorded for each resource. This setup makes it possible, for example, to capture related files as one record, possibly for different periods or in different languages or formats. All other data are stored within the “extras”. These mainly include the temporal and spatial coverage and details about the origin of imported items.
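The layout described above can be mirrored in a small data model. This is a sketch for orientation only: the class and attribute names are assumptions chosen for readability and are not the property names of the published schema, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """One file, document or service belonging to a record."""
    url: str               # the most important property of a resource
    description: str = ""
    format: str = ""


@dataclass
class Record:
    """Top-level properties, a list of resources, and free-form extras."""
    title: str
    identifier: str
    description: str
    responsible: str
    terms_of_use: str
    resources: List[Resource] = field(default_factory=list)
    extras: Dict[str, object] = field(default_factory=dict)

    def formats(self) -> List[str]:
        """Return the distinct formats in which this record is available."""
        return sorted({res.format for res in self.resources if res.format})


# One record bundling the same (invented) data in two formats.
record = Record(
    title="Example budget data",
    identifier="example-budget-data",
    description="Illustrative record with two related files.",
    responsible="Example Department",
    terms_of_use="cc-by",
    resources=[
        Resource(url="http://example.org/budget.csv", format="CSV"),
        Resource(url="http://example.org/budget.xls", format="XLS"),
    ],
    extras={"temporal_coverage": "2012", "origin": "example-catalog"},
)
```

Grouping related files under one record, as shown here, is what lets a portal present several formats or periods of the same data as a single search result.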

On github.com you can find a tabular HTML representation alongside the schema, as well as lists of the categories and licenses to be used. We look forward to comments, suggestions and questions.


On 25 September 2012, representatives from Bavaria, Berlin, Bremen, Hamburg and Baden-Wuerttemberg as well as from PortalU and GDI-DE met at our institute, Fraunhofer FOKUS in Berlin, to discuss the metadata structure for OGPD. They also discussed how existing data offerings can be transferred into the OGPD.

Harvesting refers to the merging of metadata from different catalogs. As part of OGPD, the metadata of the workshop participants as well as of DESTATIS are harvested insofar as they meet the minimum criteria for Open Data: only those records, documents and applications are accepted that have a freely available electronic resource, a description and a well-defined license.

To this end, the proposed metadata structure was discussed. It was adjusted in particular with regard to the unique identifier (to trace the origin of records unambiguously and to detect duplicates), the handling of contact details, the detection of open licenses, and the geographical coverage. In addition, the main categories for classifying datasets, documents and applications were discussed and consolidated into the following 14 main categories:

  • Economics and Labour
  • Transport and Traffic
  • Environment and Climate
  • Geography, Geology and Geodata
  • Health
  • Consumer Protection
  • Infrastructure, Construction and Housing
  • Education and Science
  • Public administration, Budget and Tax
  • Law and Justice
  • Social
  • Culture, Leisure, Sport and Tourism
  • Population
  • Politics and Elections

These main categories form the basic classification and are supplemented by more specific sub-categories, for example subject-specific ones. For harvesting, existing categorizations such as those in INSPIRE or EVAS are mapped to these 14 categories.
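Such a mapping can be sketched as a simple lookup table. The 14 target categories are taken from the list above; the three INSPIRE theme assignments are illustrative assumptions, not the mapping actually agreed on for OGPD, and the function name is hypothetical.

```python
MAIN_CATEGORIES = [
    "Economics and Labour",
    "Transport and Traffic",
    "Environment and Climate",
    "Geography, Geology and Geodata",
    "Health",
    "Consumer Protection",
    "Infrastructure, Construction and Housing",
    "Education and Science",
    "Public administration, Budget and Tax",
    "Law and Justice",
    "Social",
    "Culture, Leisure, Sport and Tourism",
    "Population",
    "Politics and Elections",
]

# Illustrative assumptions: (source scheme, source category) -> main category.
EXAMPLE_MAPPING = {
    ("inspire", "Transport networks"): "Transport and Traffic",
    ("inspire", "Geology"): "Geography, Geology and Geodata",
    ("inspire", "Human health and safety"): "Health",
}


def map_categories(scheme: str, source_categories: list) -> list:
    """Map source categories to main categories, skipping unknown ones."""
    mapped = []
    for category in source_categories:
        target = EXAMPLE_MAPPING.get((scheme.lower(), category))
        # Avoid duplicates while preserving the order of first appearance.
        if target is not None and target not in mapped:
            mapped.append(target)
    return mapped
```

Unmapped source categories are silently dropped in this sketch; a production harvester would more likely log them so that the mapping table can be completed over time.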

Once the metadata structure had been clarified as the target format for data, documents and applications, various ways of providing datasets to OGPD were discussed. As a result, four different ways will be implemented and offered:

  • Passive provision via CSW, which is used, for example, for the spatial data catalog and PortalU
  • Passive provision via CKAN / JSON, which is used, for example, in Berlin, Hamburg and Bremen
  • Active provision via the CKAN API, which will be used, for example, by Bavaria
  • Manual entry via a form, which is used, for example, by the Ministry of Finance for the fiscal data

The main result of our harvesting workshop is certainly the revised metadata structure, which is now available on github.com.

Maintaining the metadata structure for OGPD on GitHub allows transparent, collaborative development including version control. Change requests can be made publicly, the history of the metadata structure is documented, and the current status is always visible.

Florian Marienfeld and Thomas Scheel have just added an HTML representation of the JSON schema of the metadata structure for OGPD, which makes the metadata structure more readable and easier to understand.

We look forward to your comments and suggestions on the metadata structure and / or on harvesting, either directly on GitHub or just as gladly here.


Three words in the title currently guarantee full rows of seats at lectures: Open, Government and Data. Opening up government and administration is one of the hot topics par excellence. In its broad sense the issue is covered by the term “Open Government”; where it centers on the opening of data and information, the term “Open Data” covers it.

In October, leading pioneers of the Open Government Data movement in the German-speaking DA-CH-LI region met in Vienna at the OGD DA-CH-LI conference to share experiences among government, business, science and society. Prof. Ina Schieferdecker and I attended on behalf of the Fraunhofer Institute FOKUS. The general topics were the release and delivery of governmental and administrative data, cross-border interoperable data exchange, and the harmonization of OGD standardization. The presentations and conference proceedings [PDF] are available online.

Germany, too, was represented by numerous actors. Dr. Wolfgang Both reported on the state of affairs in Berlin; Jan-Ole Beyer argued that open data “is the basis for stronger cooperation and participation – in short: for Open Government”; Prof. Ina Schieferdecker reported on the current status of standardization and metadata; and I argued that, for all the necessary attention to Open Data, the broader perspective of Open Government must not be neglected.

Open Government Data in Berlin: experiences and current situation

On behalf of the Senate Department for Economics, Technology and Research in Berlin, Dr. Wolfgang Both is tasked with setting up the first open data portal in Germany. As an experienced Open Data expert, he shared with those present his experiences after one year of the Open Data Portal in Berlin. He particularly emphasized the positive effects of cooperation with the Internet community and the positive experience with the Apps4Berlin competition. Since August 2012, an interagency working group on Open Data has been meeting in Berlin. In 2013 it will deal with issues including continuing education on open data and open government, and it is working on a government decision to set standards for open data. The latter is scheduled for December 2013. You can find the talk here [PDF].

Open data as a basis for stronger cooperation and participation: the state of affairs in Germany

Jan-Ole Beyer from the Ministry of the Interior reported in his presentation [PDF] on the development of Open Government Data in Germany, emphasizing that open data “is the basis for stronger cooperation and participation – in short: for Open Government”. Contrary to momentary confusion on Twitter (Tweet 1 / Tweet 2), he announced the prototype of the Open Government Platform for early 2013. Beyer also stressed concrete measures that need to be addressed, for example the terms of use, a unified metadata schema, and a strategy for communication and public relations, as well as information and training materials for different audiences: “People need to learn what they can do with the data and how this information can be used for studies or in school.”

Harmonization of metadata: paving the way for transnational access to open data

The metadata structures themselves were emphasized as an important topic of cross-national debate. In October 2012 Austria published version 2.0 of its metadata standard and is already working on version 2.1, which will probably be available here in December. In the Open Government Platform Germany project, a first German metadata structure is currently being developed. Ina Schieferdecker discussed its current state with the DA-CH-LI representatives in a workshop. In her presentation she drew, among other things, a quantitative and qualitative comparison of the Austrian and German approaches. Both metadata structures are based on CKAN and have many similarities. Differences exist, for example, in the number of required fields (AT: 11 vs. D: 10) and optional fields (AT: 20 vs. D: 17). Qualitatively, they differ, among other things, in how they deal with dates and contact persons and in their categories. Schieferdecker concluded that the discrepancies can be overcome despite these differences, and she was optimistic about a possible harmonization.

Governments and administrations as so-called “black boxes” will no longer endure

In my paper [PDF] I argued that Open Government is more than Open (Government) Data, a position that the German Informatics Society (Gesellschaft für Informatik) also recently published in similar form in its memorandum on the opening of state and government [PDF]. Not only has the technology changed, but so have the expectations regarding actions and decision-making processes in the public sector. A political-administrative system that appears from the outside as an incomprehensible black box will therefore not endure much longer. Rather, the political-administrative system should be open to participation by third parties (participation) and, given its shrinking resources, should aim for stronger cooperation with society on specific tasks and projects (collaboration). Regarding traceability (transparency), I think it is important to make not only data but also information, that is documents, available in appropriate formats on open data portals. Otherwise many topics of interest to citizens, such as laws, remain on the sidelines. Especially when Open Data is treated as part of Open Government, the interactions between citizens, governments and businesses should in future come much more to the fore alongside the technical analysis of the data. Only in this way will open data and information be integrated into meaningful participation and collaboration processes, and only then can input from third parties flow seamlessly into administrative processes and add value there.

The OGD DA-CH-LI conference series will probably continue in Berlin in spring 2013 and in Switzerland in the fall of 2013.
