Open Bibliography and Open Bibliographic Data » announcement
http://openbiblio.net
Open Bibliographic Data Working Group of the Open Knowledge Foundation
Tue, 08 May 2018 15:46:25 +0000

BibServer new functionality
http://openbiblio.net/2012/03/19/bibserver-new-functionality/
Mon, 19 Mar 2012 22:22:49 +0000

During the sprint last week we made a lot of progress with the new functionality for version 0.5.0. However, Etienne and I got so excited by some new ideas that we did not finish on time; apologies for the delay.

We will be making the new version available over the course of this week, and will have it up and running on http://bibsoup.net soon.

Below is an overview of the new functionality you can expect to see over the course of the next week; we will write some blog posts about the various new capabilities, and this will tie in with the focus of the next sprint – doing docs, tests and issues (no new functionality).

  • editing of records and collections
  • merging collections from multiple sources
  • adding notes to records
  • much improved search UI
  • embed images in search results
  • better visualisation of collections
  • UI embeddable in other web pages via JavaScript
  • asynchronous parsing – you don’t have to hang on the page waiting for it to complete
  • feedback tickets from asynchronous parses
  • sharing collection admin rights with other users
  • new parser for NLM XML
  • new parser concept: a search term retrieves pages from Wikipedia, and citations are pulled from those pages
  • capability to accept and run parsers written in different programming languages
  • browse site users
Medline dataset
http://openbiblio.net/2011/05/23/medline-dataset/
Mon, 23 May 2011 09:56:55 +0000

Announcing the CC0 Medline dataset

We are happy to report that we now have a full, clean public domain (CC0) version of the Medline dataset available for use by the community.

What is the Medline dataset?

The Medline dataset is a subset of bibliographic metadata covering approximately 98% of all PubMed publications. The dataset comes as a package of approximately 653 XML files, which list records chronologically by the date each record was created. There are approximately 19 million publication records.

Medline is a maintained dataset, and updates are appended chronologically to the current dataset.

Read our explanation of the different PubMed datasets for further information.

Where to get it

The raw dataset can be downloaded from CKAN: http://ckan.net/package/medline

What is in a record

Most records contain useful non-copyrightable bibliographic metadata such as author, title, journal and PubMed record ID. Many also have DOIs. We have stripped out any potentially copyrightable material such as abstracts.

Read our technical description of a record for further information.
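
As a rough illustration of reading such records, the sketch below pulls author, title, journal and PubMed ID out of one record using only the Python standard library. The element names (MedlineCitation, PMID, Article, ArticleTitle, AuthorList) are an assumption based on the general NLM citation XML layout and should be checked against the technical description; the sample record is invented, and the helper names text_of and parse_records are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only; it is not taken from the dataset.
# The element names are assumed, not confirmed by the announcement.
SAMPLE = """
<MedlineCitationSet>
  <MedlineCitation>
    <PMID>00000001</PMID>
    <Article>
      <Journal><Title>Example Journal</Title></Journal>
      <ArticleTitle>An example article title.</ArticleTitle>
      <AuthorList>
        <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
        <Author><LastName>Roe</LastName><ForeName>Richard</ForeName></Author>
      </AuthorList>
    </Article>
  </MedlineCitation>
</MedlineCitationSet>
"""


def text_of(parent, path):
    """Return the stripped text at `path` under `parent`, or None if absent."""
    node = parent.find(path)
    return node.text.strip() if node is not None and node.text else None


def parse_records(xml_text):
    """Yield one dict of bibliographic fields per MedlineCitation element."""
    root = ET.fromstring(xml_text)
    for citation in root.iter("MedlineCitation"):
        authors = []
        for author in citation.findall("Article/AuthorList/Author"):
            parts = [text_of(author, "ForeName"), text_of(author, "LastName")]
            # Skip missing name parts rather than emitting "None".
            authors.append(" ".join(p for p in parts if p))
        yield {
            "pmid": text_of(citation, "PMID"),
            "title": text_of(citation, "Article/ArticleTitle"),
            "journal": text_of(citation, "Article/Journal/Title"),
            "authors": authors,
        }


records = list(parse_records(SAMPLE))
```

This sketch loads the whole document into memory, which is fine for a sample. For the full files a streaming parser such as ET.iterparse would probably be a better fit.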

Sample usage

We have made an online visualisation of a sample of the Medline dataset. However, the visualisation relies on WebGL, which is not yet supported by all browsers. It should work in Chrome and probably Firefox 4.

This is just one example, but it shows what great things we can build, and learn from, when we have open access to the necessary data.

Open Data from Dortmund University Library
http://openbiblio.net/2011/03/07/open-data-from-dortmund-university-library/
Mon, 07 Mar 2011 20:01:56 +0000

Dortmund University Library catalog data was opened up on March 1st, 2011. As announced on the library’s blog, 1.2 million bibliographic records have been released into the public domain under a CC0 waiver.

The data is released in cooperation with the North Rhine-Westphalian Library Service Center (hbz) and adds to the data already opened by other libraries of the hbz network (see the overview of open data from the hbz library network).

The announcement furthermore states:

With this release we take a first step to Linked Open Data. Data from different sources, e.g. also catalog data, will be integrated in a web of data which is also called “Semantic Web”. Dortmund University Library’s data has first to be converted into web-compliant “Linked Open Data”. This happens in cooperation with the hbz.

The data was already converted to RDF using the Bibliographic Ontology (Bibo) and is now part of the Linked Open Bibliographic Data service lobid.org.

Questions and feedback regarding the data release can be sent to opendata@ub.tu-dortmund.de.

Introducing OFS – a python "bucket"/object storage library
http://openbiblio.net/2010/09/09/introducing-ofs-a-python-bucketobject-storage-library/
Thu, 09 Sep 2010 13:57:52 +0000

Many internally distributed storage systems, such as Amazon’s S3 service or Riak’s key-value architecture, have similarities in the manner in which data is labelled and subsequently retrieved. This is often because the systems themselves use a distributed hash table or a similar distribution algorithm to disperse and then later find the data they store.

OFS is a Python library that seeks to capitalise on these similarities by providing a single, general API to put and get files from one of these services while hiding the specifics of the implementation from the user. This allows for local testing and development before transitioning to one of the cloud services, which typically cost real money and slow down testing because they must be reached over an internet connection.

Characteristics of OFS:

  • Uses a ‘bucket/label’ mechanism to identify individual files
  • Provides a list of the content in a given bucket (as best the service can provide)
  • Provides per-file metadata insofar as the service can provide it (key-value or JSON-encodable data)
  • Current backend plugins:
    • Local storage – based on the pairtree specification that optimises file-distribution across a native file-system to handle large quantities of files. Uses JSON to encode arbitrary metadata about the files in a given bucket.
    • Remote storage (S3 and Archive plugins written by Friedrich Lindenberg (pudo) who has also made large contributions to the codebase):
      • Amazon S3
      • Archive.org
      • Riak (in progress)
    • Also in progress – a REST Client by Friedrich Lindenberg (pudo)
  • One key desire is to provide opaque sharding: breaking up very large files to spread them across buckets or even systems, to improve performance and widen the range of services or backend systems OFS can make use of.
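
The pairtree idea mentioned for local storage can be sketched as follows: an identifier is cut into two-character segments, and each segment becomes one directory level, so no single directory ends up holding a huge number of entries. This is a simplified illustration, not OFS's implementation: it skips the character-escaping step that the pairtree specification defines and so only accepts plain ASCII letters and digits, and the function name pairtree_path and the default root name are assumptions made here.

```python
import string

# Simplification: the real specification escapes unsafe characters first.
SAFE_CHARS = set(string.ascii_letters + string.digits)


def pairtree_path(identifier, root="pairtree_root"):
    """Map an identifier to a nested path, two characters per directory level."""
    if not identifier:
        raise ValueError("identifier must be non-empty")
    if not set(identifier) <= SAFE_CHARS:
        raise ValueError("this sketch only handles ASCII letters and digits")
    # Split into 2-character segments; an odd-length identifier ends in a
    # 1-character segment.
    segments = [identifier[i:i + 2] for i in range(0, len(identifier), 2)]
    return "/".join([root] + segments)


even_path = pairtree_path("abcd")
odd_path = pairtree_path("abcde")
```

Because every level holds at most one directory per two-character combination, lookups stay cheap even when the store contains a very large number of files.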

It is plain that the ability to write storage code in a common way, while making use of local as well as remote ‘cloud’ storage, is of great benefit. It encourages file storage to be codified in a distributable manner so that scaling later on is easier.
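
To make the "common way" concrete, here is a minimal sketch of the bucket/label pattern with a swappable backend. It is not the OFS API: the class names BucketStore and MemoryStore, the method names, and the archive_report helper are all hypothetical, chosen only to show application code that depends on an interface rather than on a particular storage service.

```python
import json
from abc import ABC, abstractmethod


class BucketStore(ABC):
    """Hypothetical common interface; these method names are NOT OFS's API."""

    @abstractmethod
    def put(self, bucket, label, data, metadata=None):
        """Store bytes under (bucket, label) with optional metadata."""

    @abstractmethod
    def get(self, bucket, label):
        """Return the bytes stored under (bucket, label)."""

    @abstractmethod
    def metadata(self, bucket, label):
        """Return the metadata dict stored with (bucket, label)."""

    @abstractmethod
    def list_labels(self, bucket):
        """Return the labels held in a bucket."""


class MemoryStore(BucketStore):
    """In-memory backend, standing in for a local test store."""

    def __init__(self):
        self._buckets = {}

    def put(self, bucket, label, data, metadata=None):
        # Round-trip through JSON so only JSON-encodable metadata is accepted.
        meta = json.loads(json.dumps(metadata or {}))
        self._buckets.setdefault(bucket, {})[label] = (bytes(data), meta)

    def get(self, bucket, label):
        return self._buckets[bucket][label][0]

    def metadata(self, bucket, label):
        return dict(self._buckets[bucket][label][1])

    def list_labels(self, bucket):
        return sorted(self._buckets.get(bucket, {}))


def archive_report(store, text):
    """Application code that depends only on the interface, not a backend."""
    store.put("reports", "2010-09.txt", text.encode("utf-8"), {"format": "text"})
    return store.get("reports", "2010-09.txt").decode("utf-8")


store = MemoryStore()
roundtrip = archive_report(store, "hello")
```

Under this design, swapping the in-memory store for a remote one would mean writing another BucketStore subclass, while archive_report stays unchanged. That is the property the post describes.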

This is a work in progress, but the local implementation is intended to be both a reference implementation and a useful testing or even production backend for storage. Other backends may have less comprehensive metadata support for individual files, but these ‘limits’ will be included as optional warnings or exceptions once we have a handle on what they are.

Please comment or give feedback on this library. Also, we would welcome any patches for other backend support to the library!

http://bitbucket.org/okfn/ofs
