We are happy to announce the recent publication of article data in medicine and life sciences by the German National Library of Medicine (ZB MED). Two datasets were published in August and registered at CKAN/the Data Hub:
- CC MED (Current Contents Medicine): data about 650,000 journal articles gathered since 2000 from 650 German or German-language journals in medical and health-related fields (see the CKAN/Data Hub entry and the general information page about the data). 90% of this data isn’t part of the MedLine dataset.
- CC Green (Current Contents Nutrition. Environment. Agriculture.): data about 9,000 journal articles gathered since January 2011 from 200 German or German-language journals in applied life sciences (see the CKAN/Data Hub entry and here).
ZB MED produces this data by scanning the journals and extracting the information from the OCR’ed text, in part manually. Until now, it has only been used in the discovery service MEDPILOT. Currently, the open data is provided in the export format of the SISIS library system. Documentation of the format will follow. It shouldn’t be too hard, though, to understand the relevant fields. Anybody up for converting this to Linked Open Data?
Journal descriptions in lobid.org
Regarding the journal data itself (the least granular description level), each referenced journal has, since the recent update, a URI with an RDF description as Linked Open Data in lobid.org. Just take the identifier of the German “Zeitschriftendatenbank” (ZDB) (in field “2599.001”) and append it to “http://lobid.org/resource/ZDB” and you’ll get the lobid.org description. This will help people who are interested in using the data in a LOD environment. It would be especially useful to merge this with the MedLine data…
- The ZDB-ID in the first record of the CC MED example data is ‘200772-1’.
- The corresponding lobid.org-URI based on the ZDB-ID is http://lobid.org/resource/ZDB200772-1.
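For anyone who wants to script this step, here is a minimal Python sketch of the URI construction described above. The function name `lobid_uri` and the constant `LOBID_ZDB_PREFIX` are our own illustrative choices, not part of any lobid.org API, and the sketch assumes you have already pulled the ZDB-ID out of field 2599.001 yourself (parsing the SISIS export is not covered here).

```python
# Base URI to which the ZDB-ID is appended, as described in the post.
LOBID_ZDB_PREFIX = "http://lobid.org/resource/ZDB"


def lobid_uri(zdb_id: str) -> str:
    """Build the lobid.org URI for a journal from its ZDB-ID.

    Hypothetical helper: it only concatenates strings and does not
    check that the resulting URI actually resolves.
    """
    zdb_id = zdb_id.strip()  # tolerate stray whitespace from the export
    if not zdb_id:
        raise ValueError("empty ZDB-ID")
    return LOBID_ZDB_PREFIX + zdb_id


if __name__ == "__main__":
    # ZDB-ID from the first record of the CC MED example data
    print(lobid_uri("200772-1"))
```

Run on the example ZDB-ID, this prints the URI given in the list above.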