Open Bibliography and Open Bibliographic Data

JISC OpenBibliography: CUL data release

Posted on October 5, 2010 by Mark MacGillivray

The JISC OpenBibliography project has received agreement from Cambridge University Library to provide a set of bibliographic data under an open license (ODC PDDL). This is great news for the project and the wider community, and we will be working on good example uses of this data in the near future.

License details

Open Data Commons Public Domain Dedication and License:
http://www.opendatacommons.org/licenses/pddl/

This Public Domain License places no restrictions at all on what users may do with the data.

Data

This dataset consists of MARC 21 output in a single file, comprising around 180,000 records. Some work remains to tidy up the data; details are available at: http://knowledgeforge.net/pdw/trac/wiki/datatriage. New datasets will be shared as they become available.

We have created a CKAN package for the dataset:

http://ckan.net/package/jiscopenbib-cul-1

The raw data can also be downloaded directly from the following URL:

http://storage.ckan.net/openbiblio/CUL_dataset1-20100705
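
For readers who want a first look at a MARC 21 file like the one described above, here is a minimal Python sketch that splits a binary file into records and lists the fields of each one. It relies only on the MARC 21 (ISO 2709) record layout: a 24-byte leader whose first five bytes give the record length and whose bytes 12 to 16 give the base address of the data, followed by a directory of 12-byte entries. The function names are made up for illustration, and field contents are left as raw bytes because the character encoding of this particular dataset is not stated here.

```python
import io

FIELD_TERMINATOR = b"\x1e"
LEADER_LENGTH = 24
DIRECTORY_ENTRY_LENGTH = 12  # 3-byte tag, 4-byte field length, 5-byte start offset


def iter_raw_records(stream):
    """Yield each raw MARC 21 record (as bytes) from a binary stream."""
    while True:
        head = stream.read(5)
        if len(head) < 5:
            return  # end of file
        record_length = int(head)
        yield head + stream.read(record_length - 5)


def parse_fields(record):
    """Return a list of (tag, raw_field_bytes) pairs for one record."""
    base_address = int(record[12:17])
    # The directory sits between the leader and the base address and
    # ends with a single field terminator, which is skipped here.
    directory = record[LEADER_LENGTH:base_address - 1]
    fields = []
    for offset in range(0, len(directory), DIRECTORY_ENTRY_LENGTH):
        entry = directory[offset:offset + DIRECTORY_ENTRY_LENGTH]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = record[base_address + start:base_address + start + length]
        fields.append((tag, data.rstrip(FIELD_TERMINATOR)))
    return fields
```

Opening the downloaded file with `open(path, "rb")` and passing it to `iter_raw_records` should be enough to count records or to spot malformed ones during triage; a dedicated MARC library would be the better choice for anything beyond that.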

Posted in JISC OpenBib | Tagged inf11, jisc, jiscEXPO, jiscopenbib, progressPosts, WIN | 5 Comments

Data Triage Notes

Posted on September 22, 2010 by benosteen

I’ve begun to write up my experiences and notes on the triage of the datasets I am processing for the JISC Open Bibliography and Citation projects, in a way that others might make sense of them.

You can find the WIP writeup here: http://knowledgeforge.net/pdw/trac/wiki/datatriage

This will include links to the source datasets and any subsequent curated data as I am able to put them up online.

Posted in JISC OpenBib | Tagged inf11, jisc, jiscEXPO, jiscopenbib, outputs, progress | 1 Comment

Disambiguation, deduplication and 'ideals'

Posted on September 22, 2010 by benosteen

(NB Republished from a mailing list conversation at http://lists.okfn.org/pipermail/open-bibliography/2010-August/000397.html – follow this link to see the comments and replies)

In my work on meshing bibliographic datasets together, I’ve been using a
conceptual tool that I would like to hear views on.

I am creating nodes for the ideals of things on records – whether that is
for people, journals or even the bibliographic document itself. The ideal
represents the best and most complete data for that thing – something we’ll
never really achieve, but that’s not the point. This ideal serves as a node,
a hook, on which we can join up records which describe the same thing
(person, FRBR manifestation, etc.) but which hold differing data about it.

It’s easy to consider it for ‘deduplications’ of say article references.
Consider two records, one from the ris feed from pubmed and one from a
citation in a plos article. These are found to be references to the same
article but as you can expect they differ, not just in terms of data but
also in terms of the source or author of that reference.

The way I am tackling this is by creating a node for the ideal bibliographic
reference each aspires to and when dupes are believed to be found, these
ideal nodes are joined into a bundle using sameas (in a different store) and
this bundle has some provenance triples recording the how, when and why of
this merging (using Open Provenance Model verbs/classes).

Eg:

:bibrec —> record node from pubmed

:citerec —> plos record

_i suffix —> ideal node

Running the analyser on the records suggests the two are dupes, with a certain
confidence score from a certain weighted matching (call this ‘heur.v0.13’)

Create ideal nodes Just In Time:

:bibrec hasIdeal :bibrec_i
:citerec hasIdeal :citerec_i

Make the bundle:

:b1 a Bundle
   sameas :bibrec_i
   sameas :citerec_i
   opmv:wasGeneratedBy :p1
   created: 2010-08-......

:p1 a opmv:Process
  opmv:controlledBy :Ben
  opmv:used :bibrec
  opmv:used :citerec

:confidence a ConfidenceReport
  opmv:wasGeneratedBy :p1
  hasReport <url of doc>  # for time being

This structure lets me create an aggregated rdf dataset with the best guess
ideal records at any one time. Also, bundles can be merged later if required
creating a tree structure – the top bundle instance and the ‘leaf’ records
form a congruent closure and are thus exportable as such without the admin
structure triples necessary for ongoing maintenance. The bundle notion comes
from the excellent work by the team at Southampton, including Hugh Glaser,
Ian Millard et al. (google for coreference on the semantic web).
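
As a rough illustration of the bundle tree described above, the following Python sketch keeps bundles as plain objects, lets a bundle contain other bundles, and exports only the top bundle and its leaf ideal nodes, leaving the admin and provenance details behind. The class and function names are hypothetical and are not taken from the project code; the provenance fields simply mirror the example (who controlled the process, which heuristic was used, when).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now():
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Bundle:
    """A group of ideal nodes (or child bundles) believed to describe one thing."""
    name: str
    members: list        # ideal-node names (str) or child Bundle objects
    heuristic: str       # e.g. "heur.v0.13"
    confidence: float
    controlled_by: str
    created: str = field(default_factory=_now)


def leaf_ideals(bundle):
    """Flatten a bundle tree into the set of ideal nodes at its leaves."""
    leaves = set()
    for member in bundle.members:
        if isinstance(member, Bundle):
            leaves |= leaf_ideals(member)
        else:
            leaves.add(member)
    return leaves


def export_sameas(bundle):
    """Return (bundle, 'sameAs', ideal) triples without any admin triples."""
    return [(bundle.name, "sameAs", leaf) for leaf in sorted(leaf_ideals(bundle))]
```

Merging two earlier bundles is then just a matter of creating a new bundle whose members are the old ones, so the history of each merge stays inspectable while the export stays flat.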

Using this technique for entities like people is actually very similar. I
use the words ‘person’ and ‘persona’ for the ideal and the data in a record
respectively. The persona can have alternative spellings, time-dependent
details like a fleeting institutional affiliation, and so on. The
(difficult) trick is spotting when two personas refer to the same person,
but the process for merging is the same even if the creation of an
aggregated record for each is different.
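
The weighted matching that produces the confidence score is not spelled out in this post, so the following Python sketch is only a guess at its general shape: normalise a few fields, compare them, and add up the weights of the fields that agree. The field names and weights are invented for illustration and have nothing to do with the real ‘heur.v0.13’.

```python
import re
import unicodedata

# Hypothetical weights (out of 100); the actual heuristic is not given here.
WEIGHTS = {"title": 50, "year": 20, "first_author": 30}


def normalise(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).strip()


def confidence(record_a, record_b, weights=WEIGHTS):
    """Return the share (0.0 to 1.0) of weight carried by agreeing fields."""
    matched = 0
    for field_name, weight in weights.items():
        a, b = record_a.get(field_name), record_b.get(field_name)
        if a and b and normalise(a) == normalise(b):
            matched += weight
    return matched / sum(weights.values())
```

A score like this would be stored in the confidence report attached to the merging process, so that a later, better heuristic can revisit low-confidence bundles.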

Posted in JISC OpenBib | Tagged inf11, jisc, jiscopenbib, model, progress, rdf | Leave a comment

Bibliographic models in RDF

Posted on September 10, 2010 by benosteen

Put it in RDF to solve all your problems!

As with most things in life, the reality is often a little more complex. If you are old enough, you may well remember when this very same cry was often uttered, but with ‘RDF’ above replaced by ‘XML’ or if you are older still, ‘SGML’.

We haven’t quite reached the tipping point with bibliographic data in RDF at which a de facto model and structure has clearly emerged. There are plenty of contenders though, each based on differing models for how this data should be encapsulated in RDF. The main characteristic difference is in how markedly hierarchical or flat the model structure is.

A model that has emerged from the library world is FRBR – Functional Requirements for Bibliographic Records. From wikipedia:

FRBR is a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) that relates user tasks of retrieval and access in online library catalogues and bibliographic databases from a user’s perspective. It represents a more holistic approach to retrieval and access as the relationships between the entities provide links to navigate through the hierarchy of relationships.

There are plenty of articles and documents online to explain further, so I will not take up your time with a summary of it, just my opinion. FRBR is very much built around the notion of books: what a book is, taking into account things like editions and so on. Where FRBR really does fall down a rabbit hole is in the consideration of things like serials and journal articles. Their treatment feels very much like an afterthought and the philosophical ideas of Work and Expression get much murkier, especially when considering linking these records to conference papers and blog posts by the same article authors.

There is enough of a model, however, to render an understandable bibliographic ‘record’ for an article in RDF, and this post will give an example of this, using David Shotton and Silvio Peroni’s FaBIO ontology to encapsulate the information in a FRBR-like manner.

The data used comes from an IUCr paper “Nicotinamide-2,2,2-trifluoroethanol (2/1)” Acta Cryst. (2009). E65, o727-o728, which has RDF embedded in the HTML page itself. The original RDF looks something like this:

@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix prism: <http://prismstandard.org/namespaces/1.2/basic/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

<doi:10.1107/S1600536809007594>
     prism:eissn "1600-5368";
     prism:endingpage "728";
     prism:issn "1600-5368";
     prism:number "4";
     prism:publicationdate "2009-04-01";
     prism:publicationname "Acta Crystallographica Section E: Structure Reports Online";
     prism:rightsagent "med@iucr.org";
     prism:section "organic compounds";
     prism:startingpage "727";
     prism:volume "65";
     dc:creator "Bardin, J.",
         "Florence, A.J.",
         "Johnston, B.F.",
         "Kennedy, A.R.",
         "Wong, L.V.";
     dc:date "2009-04-01";
     dc:description "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present.";
     dc:identifier _9:S1600536809007594;
     dc:language "en";
     dc:link <http://scripts.iucr.org/cgi-bin/paper?fl2234>;
     dc:publisher "International Union of Crystallography";
     dc:rights <http://creativecommons.org/licenses/by/2.0/uk>;
     dc:source <urn:issn:1600-5368>;
     dc:subject "";
     dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)";
     dc:type "text";
     dcterms:abstract "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present.".

The same bibliographic information, rendered into a FaBIO model (amongst other ontologies):

@prefix fabio: <http://purl.org/spar/fabio/> .
@prefix c4o: <http://purl.org/spar/c4o/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix prism: <http://prismstandard.org/namespaces/basic/2.0/> .
@prefix : <#> .  # assumed default prefix for :article, :issue etc.; the post does not declare one

:article
    a fabio:JournalArticle
    ; dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)"
    ; dcterms:creator [ a foaf:Person ; foaf:name "Johnston, B.F." ]
    ; dcterms:creator [ a foaf:Person ; foaf:name "Florence, A.J." ]
    ; dcterms:creator [ a foaf:Person ; foaf:name "Bardin, J." ]
    ; dcterms:creator [ a foaf:Person ; foaf:name "Kennedy, A.R." ]
    ; dcterms:creator [ a foaf:Person ; foaf:name "Wong, L.V." ]
    ; dc:rights <http://creativecommons.org/licenses/by/2.0/uk>
    ; dc:language "en"
    ; fabio:hasPublicationYear "2009"
    ; fabio:publicationDate "2009-04-01"
    ; frbr:embodiment :printedArticle , :webArticle
    ; frbr:partOf :issue
    ; fabio:doi "10.1107/S1600536809007594"
    ; frbr:part :abstract
    ; prism:rightsagent "med@iucr.org" .

:abstract
    a fabio:Abstract
    ; c4o:hasContent "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present." .

:printedArticle
    a fabio:PrintObject
    ; prism:pageRange "727-728" .

:webArticle
    a fabio:WebPage
    ; fabio:hasURL "http://scripts.iucr.org/cgi-bin/paper?fl2234" .

:volume
    a fabio:JournalVolume
    ; prism:volume "65"
    ; frbr:partOf :journal .

:issue
    a fabio:JournalIssue
    ; prism:issueIdentifier "4"
    ; frbr:partOf :volume .

:journal
    a fabio:Journal
    ; dc:title "Acta Crystallographica Section E: Structure Reports Online"
    ; fabio:hasShortTitle "Acta Cryst. E"
    ; dcterms:publisher [ a foaf:Organization ; foaf:name "International Union of Crystallography" ]
    ; fabio:issn "1600-5368" .
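
One practical consequence of the hierarchical layout is that journal-level facts have to be reached by walking the frbr:partOf chain. The Python sketch below does this over a handful of triples copied from the example above, held as plain tuples so that no RDF library is needed; the helper names are made up for illustration.

```python
# Triples transcribed from the FaBIO example, reduced to the containment chain.
TRIPLES = [
    (":article", "frbr:partOf", ":issue"),
    (":issue", "frbr:partOf", ":volume"),
    (":volume", "frbr:partOf", ":journal"),
    (":issue", "prism:issueIdentifier", "4"),
    (":volume", "prism:volume", "65"),
    (":journal", "fabio:issn", "1600-5368"),
]


def objects(triples, subject, predicate):
    """Return every object of the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def container_chain(triples, start):
    """Follow frbr:partOf links upwards from `start`, guarding against cycles."""
    chain, seen, node = [], {start}, start
    while True:
        parents = objects(triples, node, "frbr:partOf")
        if not parents or parents[0] in seen:
            return chain
        node = parents[0]
        seen.add(node)
        chain.append(node)


def find_issn(triples, article):
    """Look for an ISSN on the nearest container that carries one."""
    for node in container_chain(triples, article):
        issns = objects(triples, node, "fabio:issn")
        if issns:
            return issns[0]
    return None
```

Here the ISSN is three hops away from the article, which is the price paid for being able to describe issues and volumes as things in their own right.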

The most obvious model and ontology to have emerged for describing bibliographic metadata in RDF is the Bibliographic Ontology (BIBO), developed by Frédérick Giasson and Bruce D’Arcus. It has been in existence for long enough to gain acceptance by a number of other projects, such as EPrints, Talis Aspire and Chronicling America (the Chronicling America website at the Library of Congress provides a view on millions of pages of digitized newspaper content from around the United States).

The same data again, rendered this time using BIBO’s model and ontology, rather than a FRBR-like one:

@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix prism: <http://prismstandard.org/namespaces/basic/2.0/> .
@prefix : <#> .  # assumed default prefix for :author1 etc.; the post does not declare one

<info:doi:10.1107/S1600536809007594>
    a bibo:Article
    ; dc:title "Nicotinamide-2,2,2-trifluoroethanol (2/1)"
    ; dc:isPartOf <urn:issn:16005368>
    ; bibo:volume "65"
    ; bibo:issue "4"
    ; bibo:pageStart "727"
    ; bibo:pageEnd "728"
    ; dc:creator :author1
    ; dc:creator :author2
    ; dc:creator :author3
    ; dc:creator :author4
    ; dc:creator :author5
    ; bibo:authorList (:author1 :author2 :author3 :author4 :author5)
    ; dc:rights <http://creativecommons.org/licenses/by/2.0/uk>
    ; dc:language "en"
    ; dc:date "2009-04-01"
    ; bibo:doi "10.1107/S1600536809007594"
    ; bibo:abstract "The nicotinamide (NA) molecules of the title compound, 2C6H6N2O.C2H3F3O, form centrosymmetric R22(8) hydrogen-bonded dimers via N-H...O contacts. The asymmetric unit contains two molecules of NA and one trifluoroethanol molecule disordered over two sites of equal occupancy. The packing consists of alternating layers of nicotinamide dimers and disordered 2,2,2-trifluoroethanol molecules stacking in the c-axis direction. Intramolecular C-H...O and intermolecular N-H...N, O-H...N, C-H...N, C-H...O and C-H...F interactions are present."
    ; prism:rightsagent "med@iucr.org" .

<urn:issn:16005368>
    a bibo:Journal
    ; dc:title "Acta Crystallographica Section E: Structure Reports Online"@en
    ; bibo:shortTitle "Acta Cryst. E"@en
    ; bibo:issn "1600-5368" .

:author1
    a foaf:Person
    ; foaf:name "Johnston, B.F." .

:author2
    a foaf:Person
    ; foaf:name "Florence, A.J." .

:author3
    a foaf:Person
    ; foaf:name "Bardin, J." .

:author4
    a foaf:Person
    ; foaf:name "Kennedy, A.R." .

:author5
    a foaf:Person
    ; foaf:name "Wong, L.V." .

Comments on which is the most usable, the most understandable and which is likely to be the better model for sharing this data with other people are most welcome. This is an area in which the community will have to choose a model: in practice, wrapping the information in any of the models is straightforward, but if you put it into a model that no one uses, the model becomes more of a data coffin than a useful concept.
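
To give a feel for how mechanical the wrapping step can be, here is a small Python sketch that turns a flat property/value record, like the original IUCr RDF, into BIBO-style triples. The mapping table is read off the two examples in this post and is far from complete; the function name and the use of plain tuples instead of an RDF library are choices made for illustration only.

```python
# Mapping read off the examples in this post; unlisted properties are dropped.
FLAT_TO_BIBO = {
    "prism:volume": "bibo:volume",
    "prism:number": "bibo:issue",
    "prism:startingpage": "bibo:pageStart",
    "prism:endingpage": "bibo:pageEnd",
    "dc:title": "dc:title",
    "dc:language": "dc:language",
    "dc:date": "dc:date",
}


def to_bibo(subject, flat_record):
    """Turn a flat {property: value} record into BIBO-style triples."""
    triples = [(subject, "rdf:type", "bibo:Article")]
    for source_property, target_property in FLAT_TO_BIBO.items():
        if source_property in flat_record:
            triples.append((subject, target_property, flat_record[source_property]))
    # Authors become nodes of their own, as in the BIBO example (:author1, ...).
    for position, name in enumerate(flat_record.get("dc:creator", []), start=1):
        author = f":author{position}"
        triples.append((subject, "dc:creator", author))
        triples.append((author, "rdf:type", "foaf:Person"))
        triples.append((author, "foaf:name", name))
    return triples
```

The hard part, as the paragraph above suggests, is not this translation but agreeing on which target vocabulary everyone else will actually read.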

Posted in JISC OpenBib | Tagged bibliographic, inf11, jiscEXPO, jiscopenbib, model, ontology, progress, rdf | 7 Comments

Open Bibliographic Data Flyer

Posted on September 10, 2010 by Adrian Pohl

The OKFN Working Group on Open Bibliographic Data created this draft version of an Open Bibliographic Data flyer. It shows libraries the benefits of Open Data and appeals to them to open their data up.

The work on this flyer is taking place at this Etherpad. Go there for immediate corrections and use this post’s comment section for further suggestions and discussion.

Open up bibliographic data!

Over the past few years, open licensing has facilitated the explosive growth of a ‘knowledge commons’. To give a few prominent examples: Open Access journals, Open Educational Resources and Open Data in scientific research have all been enabled by licenses which permit material to be freely re-used and re-distributed.
Bibliographic records are a key part of our shared cultural heritage. They too should therefore be open, that is made available to the public for access and re-use under an open license which permits use and reuse without restriction [2].

Stimulating cooperation within and beyond the library world

Opening up bibliographic data is a logical extension of the ideals of cooperation and sharing that have been a constant between libraries and library networks for more than a century.
Library institutions are among the first to benefit from a liberalization of bibliographic data (see below). But publishing open bibliographic data not only enhances cooperation within the traditional library community, it also enables usage by non-library institutions such as Wikipedia or the Internet Archive, thus opening up new vistas for cooperation.
Moreover, the enhancements that result from collaboration with a wider community can be fed back into library catalogues and other traditional library resources, enhancing their value while reducing costs.

Advantages for libraries

But what advantages does Open Data have for libraries? We firmly believe that the release of publicly funded bibliographic data is a priority task for libraries and library networks, which can achieve four main goals:

Ensuring the fullest possible use is made of the wealth of bibliographic data
Maintaining and increasing the relevance of library institutions by increasing their visibility in the World Wide Web and the Semantic Web
Broadening participation in bibliographic data creation, enhancement, and correction
Enabling creation of new services/applications utilizing bibliographic data generating benefits both for researchers and library users and the wider community

Catalogue enrichment with Open Data

The spread of Open Data practice has direct benefits for the library world: for example, it facilitates the enrichment of individual catalogues with information provided by other libraries, e.g. keywords and classification numbers. This improves the search functionality of the catalogue, thus benefitting both librarians and users. At the same time, it also facilitates the enrichment of catalogues from non-library sources (publishers, booksellers, book fan sites) as well as from the scholarly community.

Increasing the visibility of libraries in the Web

The visibility of data can be vastly increased by free publication, as it is no longer accessible solely via the catalogue. The data will be transferred from the library silo, a part of the “deep web”, into the visible web and can be directly detected by Google and others. As a result non-library sites can also link to library resources, thus increasing the visibility of the libraries and their services.

Advantages for library users

Besides optimizing the research tools for bibliographic resources, opening library data will allow users to more easily integrate library metadata into their own work.

Easier creation of bibliographies
Better integration with teaching functions
Increased integration between libraries and research
Development of new tools and services whose features and benefits we now anticipate!

[1] Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (2003) http://oa.mpg.de/openaccess-berlin/berlindeclaration.html

[2] http://www.opendefinition.org/

Posted in Data | 3 Comments

Minutes: 3rd Virtual Meeting of the Open Bibliography Group

Posted on September 10, 2010 by Adrian Pohl

Participants

Adrian Pohl
John Mark Ockerbloom
Rufus Pollock
Peter Murray-Rust

Apologies

Micah Altman

Minutes of former meetings

For the minutes of former meetings see

1st meeting: http://okfnpad.org/Qv1jMMQL1h
2nd meeting: http://okfnpad.org/wTnZOOxiRz

Agenda

Working group blog/website

We decided to use http://openbiblio.net/ as the Working Group’s blog. In the future we’ll publish minutes and information about ongoing activities there.

We also reminded ourselves to use the group’s hashtag (#openbiblio) more frequently on twitter etc.

Flyer

We did some more work on the Open Bibliographic Data flyer. It’s now posted on the blog and we solicit wider comments.

Principles for Open Bibliographic Data

see http://okfnpad.org/openbibliography-principles

We did some more work on this text and decided to address not only libraries but also publishers and other institutions producing bibliographic data. More discussion is encouraged but please move it into the discussion part at the bottom of the document.

Debrief from #jiscopenbib

We had no time left for this. So it will happen on list.

Infos from the lld-xg?

Nobody from the lld-xg attended the meeting.

Upcoming: Events and projects

John Mark Ockerbloom hopes to promote open data at BooksOnline10 in Toronto (Oct 26)
http://research.microsoft.com/en-us/events/booksonline10/

Posted in minutes, OKFN Openbiblio, Uncategorized | 1 Comment

Introducing OFS – a python "bucket"/object storage library

Posted on September 9, 2010 by benosteen

Many internally distributed storage systems – such as Amazon’s S3 service or Riak’s key-value architecture – have similarities in the manner in which data is labelled and subsequently retrieved. This is often because the systems themselves use a distributed hash table or a similar distribution algorithm to disperse and then later find the data they store.

OFS is a python library that seeks to capitalise on these similarities by providing a single, general API to put and get files from one of these services while hiding the specifics of the implementation from the user. This allows for local testing and development before transitioning to one of the cloud services, which typically cost real money and slow down testing because every call has to cross an internet connection.

Characteristics of OFS:

Uses a ‘bucket/label’ mechanism to identify individual files
Provides a list of content in a given bucket (as best as the service can provide)
Provides per-file metadata in so far as the service can provide (key-value or JSON encode-able data)
Current backend plugins:
- Local storage – based on the pairtree specification that optimises file-distribution across a native file-system to handle large quantities of files. Uses JSON to encode arbitrary metadata about the files in a given bucket.
- Remote storage (S3 and Archive plugins written by Friedrich Lindenberg (pudo) who has also made large contributions to the codebase):
  - Amazon S3
  - Archive.org
  - Riak (in progress)
- Also in progress – a REST Client by Friedrich Lindenberg (pudo)
- One key desire is to provide opaque sharding – breaking up very large files to spread across buckets or even systems to improve performance and the range of services or backend systems OFS can make use of.
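
As a simplified illustration of the pairtree idea used by the local storage backend listed above, the Python sketch below maps an identifier onto a nested directory path by cutting it into two-character segments. It deliberately leaves out the character-cleaning step that the pairtree specification applies before splitting, and the function name and root directory are assumptions made for this example, not part of OFS.

```python
from pathlib import PurePosixPath


def pairtree_path(identifier, root="pairtree_root", segment_length=2):
    """Split an identifier into fixed-width directory segments under `root`.

    Simplification: identifiers are assumed to be filesystem-safe already;
    the real specification first escapes problematic characters.
    """
    if not identifier:
        raise ValueError("identifier must not be empty")
    segments = [
        identifier[i:i + segment_length]
        for i in range(0, len(identifier), segment_length)
    ]
    return PurePosixPath(root, *segments)
```

Because every directory level only ever branches on two characters, no single directory ends up holding an unmanageable number of entries, which is what makes the layout suitable for large quantities of files.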

It is plain that having the ability to write storage code in a common way, but make use of local as well as remote ‘cloud’ storage is of a great benefit. It encourages file storage to be codified in a distribute-able manner so that scaling later on is easier.

This is a work in progress, but the local implementation is intended to be both a reference implementation and a useful testing or even production backend for storage. Other backends may have less comprehensive metadata support for individual files, but these ‘limits’ will be included as optional warnings or exceptions once we have a handle on what they are.
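
To make the bucket/label idea a little more concrete, here is a toy Python sketch of a store with put, get, list and per-file metadata. It is emphatically not the OFS API: the class, method names and in-memory backend are invented for illustration, and the real library should be consulted for its actual interface.

```python
import json


class InMemoryStore:
    """Hypothetical bucket/label store kept in a dict; not the OFS API."""

    def __init__(self):
        self._buckets = {}

    def put(self, bucket, label, data, metadata=None):
        metadata = dict(metadata or {})
        json.dumps(metadata)  # raises TypeError if not JSON-encodable
        self._buckets.setdefault(bucket, {})[label] = (bytes(data), metadata)

    def get(self, bucket, label):
        return self._buckets[bucket][label][0]

    def get_metadata(self, bucket, label):
        return dict(self._buckets[bucket][label][1])

    def list_labels(self, bucket):
        return sorted(self._buckets.get(bucket, {}))


def archive_dataset(store, name, payload):
    """Application code written once, against whichever store is passed in."""
    store.put("datasets", name, payload, {"bytes": len(payload)})
    return store.list_labels("datasets")
```

The point of the design is visible in `archive_dataset`: it only relies on the shared method names, so a remote backend offering the same methods could be swapped in later without touching the calling code.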

Please comment or give feedback on this library. Also, we would welcome any patches for other backend support to the library!

http://bitbucket.org/okfn/ofs

Posted in JISC OpenBib | Tagged announcement, inf11, jisc, jiscEXPO, jiscopenbib, software | 1 Comment

JISC OpenBibliography: Projected Timeline, Workplan & Overall Project Methodology

Posted on August 31, 2010 by Mark MacGillivray

The JISC OpenBiblio project is scheduled to run from 1st July 2010 to 31st March 2011. During that time, the project will run 2 week iterative development cycles, each including (for links to trac, code repository, wiki etc see project resources page):

Weekly meetings
Technical lead reports on development since last meeting; incomplete functionality is moved into the next development cycle or abandoned (if at the end of a cycle)
Advocacy lead reports on development since last meeting; team should be updated about recent advocacy successes and about events soon to take place.
Team discuss and decide technical developments for next development cycle
A project blog post should be written each time a success occurs, and referenced to a deliverable listed under the work packages in the JISC project bid.
A project blog post should be written describing obstacles causing delay to any functionality aims or advocacy successes.
Team members should identify topics raised via the mailing lists that need further consideration. This is to manage how mailing list discussions become documented parts of the project; anything from the mailing list that becomes significant to the project should be documented in a blog post / comment / trac task as appropriate.
Team members keep notes in the meeting minutes document.
Technical lead (or others) updates trac as necessary to keep note of technical aims, successes and failures. The trac should be viewed as a resource for the technical lead to report to the team, and as a source of information for writing up the project report, but not as a project management tool
OKF project wiki and JISC expo spreadsheet should be updated as required to reflect any changes in location of key documents

Rather than aiming to develop a specific product, the aim is to develop as much useful output as possible before the project deadline, ideally meeting or exceeding the deliverables defined under the work packages in the JISC project bid.

This attitude is suitable due to the nature of the project – a significant amount of advocacy is required to convince data publishers of the benefits of open access to bibliographic information, and this work in itself should not be overlooked. Therefore, achievements in attaining open agreements will lead to further development opportunities. Overall success can be measured against three strands:

Publicity / advocacy successes – e.g. a good response at a conference to a discussion of the project goals.
Agreements to provide open data – when data providers actually commit to allowing access to their datasets; this is a specific achievement over and above those in point 1.
Technical developments – with access to open data sets, develop examples of how they can be put to valuable use for the community; this should feed back into point 1, leading to more of point 2, and so on.

Posted in JISC OpenBib | Tagged inf11, jisc, jiscEXPO, jiscopenbib, projectMethodology, projectPlan, timeline | Leave a comment

JISC OpenBibliography: Project Team Relationships and End User Engagement.

Posted on August 31, 2010 by benosteen

The JISC OpenBibliography project team:

Peter Murray-Rust of the University of Cambridge Unilever Centre, Department of Chemistry is a reader in Molecular Informatics with nearly 200 publications. Peter will focus on the project direction and the co-ordination of the project partners, and will contribute major software enhancements to his JUMBO library for converting CIF to RDF.
Dr. Rufus Pollock will contribute to most areas of work, focusing on the project direction, management and dissemination. He will also help develop the metadata design, store architecture and disambiguation work. He is a co-founder and board member of the Open Knowledge Foundation with extensive experience of the legal, social and technical aspect of open information and bibliographic data in particular. He has also worked extensively with bibliographic metadata including the full Cambridge University Library catalogue and on developing databases, processing of bibliographic formats (including MARC), and matching of entities from different datasets.
Ben O’Steen has 13 years IT development experience and has most recently worked at the Oxford University Library Service as the software architect for the Bodleian Library’s DAMS (Digital Asset Management System). He has extensive experience working with RDF, bibliographic and related metadata standards and distributed system design. He was part of the winning team of Repository Challenge 08 with an entry that provided an RDF Linked Data view on two of the leading open-source repository systems. O’Steen will contribute to most areas including metadata design and realisation, triple storage and SPARQL endpoints and user-facing interfaces and query systems.

Posted in JISC OpenBib | Tagged inf11, jiscEXPO, jiscopenbib, projectPlan, projectTeam, users | Leave a comment

Beginnings of an Object Description Mapper

Posted on August 21, 2010 by Open Knowledge International

This is the analogue of an Object-Relational Mapper for RDF. It helps to make OWL Description Logic accessible from Python in a way that will seem familiar to people who are accustomed to things like SQLAlchemy and SuRF.

http://packages.python.org/ordf/odm.html
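
For readers unfamiliar with the mapper idea, the Python sketch below shows the general pattern of exposing RDF predicates as ordinary attributes by means of descriptors. It is a generic illustration only: the class names are invented, it is not the ordf API documented at the link above, and it does nothing with OWL Description Logic itself.

```python
class Predicate:
    """Descriptor mapping a Python attribute onto one RDF predicate."""

    def __init__(self, uri):
        self.uri = uri

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return [o for p, o in instance.statements if p == self.uri]

    def __set__(self, instance, values):
        # Replace every existing statement for this predicate.
        kept = [(p, o) for p, o in instance.statements if p != self.uri]
        instance.statements = kept + [(self.uri, value) for value in values]


class Resource:
    """A subject identifier plus its (predicate, object) statements."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.statements = []

    def triples(self):
        return [(self.identifier, p, o) for p, o in self.statements]


class Person(Resource):
    name = Predicate("http://xmlns.com/foaf/0.1/name")
```

Assigning `person.name = ["Bardin, J."]` then reads like ordinary attribute access, while `person.triples()` still yields plain subject, predicate, object statements that could be handed to a triple store.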

Posted in Uncategorized | Leave a comment