Put it in RDF to solve all your problems!
As with most things in life, the reality is often a little more complex. If you are old enough, you may well remember when this very same cry was often uttered, but with ‘RDF’ replaced by ‘XML’ or, if you are older still, ‘SGML’.
Bibliographic data in RDF has not quite reached the tipping point at which a de facto model and structure clearly emerges. There are plenty of contenders, though, each based on a different model for how this data should be encapsulated in RDF. The main difference between them is how markedly hierarchical or flat the model structure is.
A model that has emerged from the library world is FRBR (Functional Requirements for Bibliographic Records). From Wikipedia:
FRBR is a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) that relates user tasks of retrieval and access in online library catalogues and bibliographic databases from a user’s perspective. It represents a more holistic approach to retrieval and access as the relationships between the entities provide links to navigate through the hierarchy of relationships.
There are plenty of articles and documents online that explain it further, so I will not take up your time with a summary, just my opinion. FRBR is very much built around the notion of books: what a book is, taking into account things like editions and so on. Where FRBR really does fall down a rabbit hole is in its consideration of things like serials and journal articles. Their treatment feels very much like an afterthought, and the philosophical ideas of Work and Expression become much murkier, especially when linking these records to conference papers and blog posts by the same authors.
There is enough of a model, however, to render an understandable bibliographic ‘record’ for an article in RDF, and this post will give an example of this, using David Shotton and Silvio Peroni’s FaBIO ontology to encapsulate the information in a FRBR-like manner.
The data used comes from an IUCr paper “Nicotinamide-2,2,2-trifluoroethanol (2/1)” Acta Cryst. (2009). E65, o727-o728, which has RDF embedded in the HTML page itself. The original RDF looks something like this:
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix prism: <http://prismstandard.org/namespaces/1.2/basic/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
<doi:10.1107/S1600536809007594>
prism:eissn "1600-5368";
prism:endingpage "728";
prism:issn "1600-5368";
prism:number "4";
prism:publicationdate "2009-04-01";
prism:publicationname "Acta Crystallographica Section E: Structure Reports Online";
prism:rightsagent "med@iucr.org";
prism:section "organic compounds";
prism:startingpage "727";
prism:volume "65";
dc:creator "Bardin, J.",
"Florence, A.J.",
"Johnston, B.F.",
"Kennedy, A.R.",
"Wong, L.V.";
dc:date "2009-04-01";
dc:description "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present.";
dc:identifier _9:S1600536809007594;
dc:language "en";
dc:link <http://scripts.iucr.org/cgi-bin/paper?fl2234>;
dc:publisher "International Union of Crystallography";
dc:rights <http://creativecommons.org/licenses/by/2.0/uk>;
dc:source <urn:issn:1600-5368>;
dc:subject "";
dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)";
dc:type "text";
dcterms:abstract "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present.".
Here is the same bibliographic information rendered into a FaBIO model (using other ontologies alongside it):
@prefix fabio: <http://purl.org/spar/fabio/> .
@prefix c4o: <http://purl.org/spar/c4o/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix prism: <http://prismstandard.org/namespaces/basic/2.0/> .
# Assumed default namespace so that local names such as :article resolve.
@prefix : <#> .
:article
a fabio:JournalArticle
; dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)"
; dcterms:creator [ a foaf:Person ; foaf:name "Johnston, B.F." ]
; dcterms:creator [ a foaf:Person ; foaf:name "Florence, A.J." ]
; dcterms:creator [ a foaf:Person ; foaf:name "Bardin, J." ]
; dcterms:creator [ a foaf:Person ; foaf:name "Kennedy, A.R." ]
; dcterms:creator [ a foaf:Person ; foaf:name "Wong, L.V." ]
; dc:rights <http://creativecommons.org/licenses/by/2.0/uk>
; dc:language "en"
; fabio:hasPublicationYear "2009"
; fabio:publicationDate "2009-04-01"
; frbr:embodiment :printedArticle , :webArticle
; frbr:partOf :issue
; fabio:doi "10.1107/S1600536809007594"
; frbr:part :abstract
; prism:rightsagent "med@iucr.org" .
:abstract
a fabio:Abstract
; c4o:hasContent "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present." .
:printedArticle
a fabio:PrintObject
; prism:pageRange "727-728" .
:webArticle
a fabio:WebPage
; fabio:hasURL "http://scripts.iucr.org/cgi-bin/paper?fl2234" .
:volume
a fabio:JournalVolume
; prism:volume "65"
; frbr:partOf :journal .
:issue
a fabio:JournalIssue
; prism:issueIdentifier "4"
; frbr:partOf :volume .
:journal
a fabio:Journal
; dc:title "Acta Crystallographica Section E: Structure Reports Online"
; fabio:hasShortTitle "Acta Cryst. E"
; dcterms:publisher [ a foaf:Organization ; foaf:name "International Union of Crystallography" ]
; fabio:issn "1600-5368" .
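To make the hierarchy concrete, here is a minimal Python sketch (plain tuples, not FaBIO tooling or any RDF library) that copies the containment triples from the rendering above and walks the frbr:partOf links from the article up to the journal. The helper names `objects` and `containment_chain` are made up for illustration.

```python
# Triples copied from the FaBIO rendering above; prefixed names are kept as plain strings.
TRIPLES = [
    (":article", "frbr:partOf", ":issue"),
    (":issue", "frbr:partOf", ":volume"),
    (":volume", "frbr:partOf", ":journal"),
    (":issue", "prism:issueIdentifier", "4"),
    (":volume", "prism:volume", "65"),
    (":journal", "dc:title",
     "Acta Crystallographica Section E: Structure Reports Online"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object found for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def containment_chain(start, triples=TRIPLES):
    """Follow frbr:partOf links upwards from `start` until none is left."""
    chain = [start]
    seen = {start}
    while True:
        parents = objects(chain[-1], "frbr:partOf", triples)
        # Stop at the top of the hierarchy, or if the data loops back on itself.
        if not parents or parents[0] in seen:
            return chain
        chain.append(parents[0])
        seen.add(parents[0])


chain = containment_chain(":article")
journal_title = objects(chain[-1], "dc:title")[0]

print(" -> ".join(chain))  # :article -> :issue -> :volume -> :journal
print(journal_title)       # Acta Crystallographica Section E: Structure Reports Online
```

Reaching the journal title takes three hops in this model, so anyone consuming the data needs to know to walk the whole chain.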
The most obvious model and ontology to have emerged for describing bibliographic metadata in RDF is the Bibliographic Ontology (BIBO), developed by Frédérick Giasson and Bruce D’Arcus. It has been in existence for long enough to gain acceptance by a number of other projects, such as EPrints, Talis Aspire and Chronicling America (the Chronicling America website at the Library of Congress provides a view on millions of pages of digitized newspaper content from around the United States).
The same data again, rendered this time using BIBO’s model and ontology, rather than a FRBR-like one:
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix prism: <http://prismstandard.org/namespaces/basic/2.0/> .
# Assumed default namespace so that local names such as :author1 resolve.
@prefix : <#> .
<info:doi/10.1107/S1600536809007594>
a bibo:Article
; dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)"
; dcterms:isPartOf <urn:issn:16005368>
; bibo:volume "65"
; bibo:issue "4"
; bibo:pageStart "727"
; bibo:pageEnd "728"
; dc:creator :author1
; dc:creator :author2
; dc:creator :author3
; dc:creator :author4
; dc:creator :author5
; bibo:authorList (:author1 :author2 :author3 :author4 :author5)
; dc:rights <http://creativecommons.org/licenses/by/2.0/uk>
; dc:language "en"
; dc:date "2009-04-01"
; bibo:doi "10.1107/S1600536809007594"
; bibo:abstract "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present."
; prism:rightsagent "med@iucr.org" .
<urn:issn:16005368>
a bibo:Journal
; dc:title "Acta Crystallographica Section E: Structure Reports Online"@en
; bibo:shortTitle "Acta Cryst. E"@en
; bibo:issn "1600-5368" .
:author1
a foaf:Person
; foaf:name "Johnston, B.F." .
:author2
a foaf:Person
; foaf:name "Florence, A.J." .
:author3
a foaf:Person
; foaf:name "Bardin, J." .
:author4
a foaf:Person
; foaf:name "Kennedy, A.R." .
:author5
a foaf:Person
; foaf:name "Wong, L.V." .
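The bibo:authorList line is what preserves author order in this rendering: repeated dc:creator statements form an unordered set, whereas the parenthesised Turtle collection expands into a linked list of rdf:first/rdf:rest triples ending in rdf:nil. A minimal Python sketch of that expansion follows. The blank node labels and the helper names `expand_collection` and `read_collection` are made up for illustration; a real Turtle parser picks its own labels.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def expand_collection(items):
    """Expand a Turtle collection such as (:a :b) into rdf:first/rdf:rest triples.

    Returns the head node of the list and the triples that describe it.
    """
    if not items:
        return RDF_NIL, []
    # One (made-up) blank node per list cell.
    nodes = ["_:b%d" % i for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        triples.append((nodes[i], RDF_FIRST, item))
        rest = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples


def read_collection(head, triples):
    """Walk the linked list from `head` and return its items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items = []
    node = head
    while node != RDF_NIL:
        items.append(first[node])
        node = rest[node]
    return items


authors = [":author1", ":author2", ":author3", ":author4", ":author5"]
head, list_triples = expand_collection(authors)

print(len(list_triples))                    # 10
print(read_collection(head, list_triples))  # the five authors, in their original order
```

Five authors therefore cost ten extra triples on top of the five dc:creator statements, which is the price of keeping the byline order in RDF.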
Comments on which is the most useable, the most understandable and what is likely to be the better model for sharing this data with other people are most welcome. This is an area in which the community will have to choose a model: in practice, wrapping the information in any of the models is straightforward, but if you put it into a model that no one uses, the model becomes more of a data coffin than a useful concept.