Minutes: 17th Virtual Meeting of the OKFN Openbiblio Group

Date: January 3rd, 2011, 16:00 GMT

Channels: The meeting was held via Skype and Etherpad

Participants

  • Adrian Pohl
  • Peter Murray-Rust
  • Thad Guidry
  • Thomas Krichel (first half)
  • Karen Coyle (first half)
  • Jim Pitman (second half)


OCLC’s FAST release

  • Is this open data according to the Open Knowledge Definition (OKD)? – Yes, see the VoID description of the dataset
  • The licence is attached to the whole dataset and not to individual resources
  • Attribution is probably required at the dataset level too
  • So far, the data is not fully OKD-compliant because no full dump of the data exists
  • Richard Cyganiak has already created an entry on the Data Hub for the FAST data.

LCSH in Freebase

  • Thad reports that all LCSH will be imported into Freebase.
  • The facets will be separated out as well (as FAST does).
  • Timeline: six months to a year

Re-Using DBLP data

  • DBLP entry at the Data Hub: http://thedatahub.org/dataset/dblp
  • Jim had some ideas about re-using the DBLP data in BibSoup, e.g. for collecting publications on deduplication.
  • The isitopendata enquiry has now been resolved by Thomas
  • Mark could run the DBLP data in a BibServer instance, but he cannot maintain it in addition to the other BibServer instances he already maintains
  • Thad: DBLP data should be uploaded to ScraperWiki (1 GB maximum)
  • Thad: maybe create a “base” at Freebase for DBLP or BibSoup.
    • Freebase is fully versioned: edits can be reverted; queries can be run against older versions
    • Thad: Freebase is more of a backend data store and doesn’t have a proper GUI. BibServer might play the GUI role.
  • ACTION: Mark will help Thad set up a BibServer instance, and Thad will push the DBLP data into it

Writing and maintaining parsers etc. for BibServer

  • Jim can write preliminary parsers but he can’t maintain the code, write bugfixes etc. He would like someone else to take that over.
  • Parsers are written in Python
  • Thad proposes pushing the code to ScraperWiki
  • ACTION: Jim and Thad will communicate about how to package, publish and maintain the code.

Jim’s recent efforts

Openbiblio Sprint

  • 17th-19th of January: Openbiblio sprint session in Cambridge? (suggested by Mark)
  • Mark is trying to get Etienne (new programmer from Amsterdam), Rufus, Ed Chamberlain (for some of the time), Naomi and Primavera together for a sprint session.

Action Collection

  • Mark, Thad: set up a BibServer instance and push DBLP data into it
  • Jim, Thad: communicate about how to package, publish and maintain the code for parsers etc.