Europeana and Linked Open Data

Europeana has recently released a new version of its Linked Data Pilot, data.europeana.eu. We now publish data for 2.4 million objects under an open metadata licence: CC0, the Creative Commons Public Domain Dedication. This post elaborates on this earlier one by Naomi.

Europeana’s interest in Linked Open Data

Europeana aims to provide the widest possible access to the European cultural heritage published on a large scale as digital resources by hundreds of museums, libraries and archives. This includes empowering other actors to build services that contribute to such access. Making data openly available to the public and private sectors alike is thus central to Europeana’s business strategy. We are also trying to provide a better service by making available richer data than what cultural institutions typically publish: data in which millions of texts, images, videos and sounds are linked to other relevant resources such as persons, places and concepts.

Europeana has therefore been interested in Linked Data for a while, as a technology that facilitates these objectives. We fully subscribe to the views expressed in the W3C Library Linked Data report, which shows the benefits (but also acknowledges the challenges) of Linked Data for the cultural sector.

Europeana’s first toe in the Linked Data water

Last year, we released a first Linked Data pilot at data.europeana.eu. This was a very exciting moment, a first opportunity for us to play with Linked Data.

We were able to deploy our prototype relatively easily, and the whole experience was extremely valuable from a technical perspective. In particular, this was the first large-scale implementation of Europeana’s new approach to metadata, the Europeana Data Model (EDM). This model enables the representation of much richer data than the current format used by Europeana in its production service. First, our pilot could use EDM’s ability to represent several perspectives on a cultural object. We have used it to distinguish the original metadata our providers send us from the data that we add ourselves. Among the Europeana data there are indeed enrichments that are created automatically and are not checked by professional data curators. For trust purposes, it is important that data consumers can see the difference.
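To make the idea of separate perspectives concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the pilot's actual data or code: the example.org URIs, the sample values and the helper function are all hypothetical. The only thing borrowed from EDM is the general pattern of attaching statements to one proxy per perspective, linked to the object with ore:proxyFor.

```python
from collections import defaultdict

# Hypothetical URIs for illustration only; these are not identifiers used by the pilot.
OBJECT = "http://example.org/item/123"
PROVIDER_PROXY = "http://example.org/proxy/provider/123"
EUROPEANA_PROXY = "http://example.org/proxy/europeana/123"

# Each triple is (subject, predicate, object). Values are made up.
TRIPLES = [
    (PROVIDER_PROXY, "ore:proxyFor", OBJECT),
    (PROVIDER_PROXY, "dc:title", "View of Delft"),
    (PROVIDER_PROXY, "dc:coverage", "Delft"),
    (EUROPEANA_PROXY, "ore:proxyFor", OBJECT),
    # An automatically created enrichment, stated from Europeana's perspective.
    (EUROPEANA_PROXY, "dcterms:spatial", "http://example.org/place/delft"),
]


def statements_by_perspective(triples, obj):
    """Group descriptive statements about obj by the proxy that makes them."""
    proxies = {s for s, p, o in triples if p == "ore:proxyFor" and o == obj}
    grouped = defaultdict(list)
    for s, p, o in triples:
        if s in proxies and p != "ore:proxyFor":
            grouped[s].append((p, o))
    return dict(grouped)


views = statements_by_perspective(TRIPLES, OBJECT)
```

With this grouping, a data consumer can choose to trust only the statements under the provider's proxy, or to combine them with the enrichments under the Europeana proxy.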

We could also better highlight part of Europeana’s added value as a central point of access to digitized cultural material, in direct connection with the above-mentioned enrichments. Europeana employs semantic extraction tools that connect its objects with large multilingual reference resources available as Linked Data, in particular Geonames and GEMET. This new metadata allows us to deliver a better search service, especially in a European context. With the Linked Data pilot we could point explicitly at these reference resources, in the same environment they are published in. We hope this will help the entire community to better recognize the importance of these sources and to continue providing authority resources in an interoperable Linked Data format, for example using the SKOS vocabulary.
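The benefit of linking objects to multilingual reference resources can be sketched as follows. This is a simplified illustration, not Europeana's search implementation: the example.org URIs, the label table and the search function are assumptions made for the example. It only shows why a link to a concept with labels in several languages lets a query in any of those languages find the object.

```python
# Hypothetical excerpt of a SKOS-style reference resource:
# concept URI -> preferred label per language.
CONCEPT_LABELS = {
    "http://example.org/place/vienna": {"en": "Vienna", "de": "Wien", "fr": "Vienne"},
}

# Hypothetical enrichment links: object URI -> linked concept URIs.
ENRICHMENTS = {
    "http://example.org/item/456": ["http://example.org/place/vienna"],
}


def search(query, enrichments, labels):
    """Return objects linked to a concept whose label in any language equals the query."""
    wanted = query.casefold()
    hits = []
    for obj, concepts in enrichments.items():
        for concept in concepts:
            concept_labels = labels.get(concept, {}).values()
            if any(label.casefold() == wanted for label in concept_labels):
                hits.append(obj)
                break  # one matching concept is enough for this object
    return hits


german_hits = search("wien", ENRICHMENTS, CONCEPT_LABELS)
french_hits = search("Vienne", ENRICHMENTS, CONCEPT_LABELS)
```

Here the object carries no German or French metadata of its own; both queries succeed only because of the link to the shared concept.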

If you are interested in more lessons learnt from a technical perspective, we published more of them in a technical paper at the Dublin Core conference last year. Among the less positive aspects, data.europeana.eu is still not part of the production system behind the main europeana.eu portal. It does not come with the service guarantees we would like to offer for the Linked Data server, though the provision of data dumps is not affected by this.

Making progress on Open Data

Another downside is that data.europeana.eu publishes data only for a subset of the objects that our main portal provides access to. We started with 3.5 million objects out of a total of 20 million. These were selected after a call for volunteers, to which only a few providers responded. Additionally, we could not release our metadata under fully open terms. This was clearly an obstacle to the re-use of our data.

After several months we have thus released a second version of data.europeana.eu. Though still a pilot, it now contains fully open metadata (CC0).

The new version covers an even smaller subset of our collections: in February 2012, data.europeana.eu contains metadata on 2.4 million objects. But this must be considered in context. The qualitative step of fully open publication is crucial to us. And over the past year, we have started an active campaign to convince our community to open up their metadata, allowing everyone to make it work harder for the benefit of end users. The metadata currently served at data.europeana.eu comes from data providers who have reacted early and positively to our efforts. We trust we will be able to make metadata available for many more objects in the coming year.

In fact, we hope that this Linked Open Data pilot can contribute to our Open Data advocacy message. We believe such technology can prompt third parties to develop innovative applications and services, stimulating end users’ interest in digitized heritage. This would of course help to convince more partners to contribute metadata openly in the future. Alongside our new pilot we have released an animation that conveys exactly this message; you can view it here.

For additional information about access to and technical details of the dataset, see data.europeana.eu and our entry on the Data Hub.

