Today (11th Jan 2012) I am attending the JISC Discovery 2012 meeting to learn about the JISC projects aiming to increase open access to research materials. Here are some notes I took during the day:

Joy Palmer – Mimas – Copac

Search for records relevant in some way to each other, visualise and export as MODS or CSV. Would it be worth exporting (bib)JSON? Would they be interested in that?
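To make the (bib)JSON question concrete, here is a rough sketch of turning a CSV export into BibJSON-style records. The CSV column names (title, authors, year, isbn), the semicolon-separated author convention, and the sample row are all assumptions made for illustration; they are not Copac's actual export format. The output keys follow the BibJSON conventions as I understand them (a list of author objects with a "name", and a list of typed identifiers).

```python
import csv
import io
import json


def csv_row_to_bibjson(row):
    """Map one CSV row (hypothetical column names) to a BibJSON-style dict."""
    record = {"title": (row.get("title") or "").strip()}
    # Assumption: multiple authors are separated by semicolons in the export.
    authors = [a.strip() for a in (row.get("authors") or "").split(";") if a.strip()]
    if authors:
        record["author"] = [{"name": name} for name in authors]
    year = (row.get("year") or "").strip()
    if year:
        record["year"] = year
    isbn = (row.get("isbn") or "").strip()
    if isbn:
        record["identifier"] = [{"type": "isbn", "id": isbn}]
    return record


def csv_to_bibjson(text):
    """Convert CSV text with a header row into a list of BibJSON-style dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [csv_row_to_bibjson(row) for row in reader]


# Made-up sample data, not a real catalogue record.
SAMPLE_CSV = (
    "title,authors,year,isbn\n"
    'An Example Book,"Doe, Jane; Roe, Richard",2011,0000000000\n'
)

if __name__ == "__main__":
    print(json.dumps(csv_to_bibjson(SAMPLE_CSV), indent=2))
```

Optional fields are left out rather than emitted as empty values, so a consumer only has to handle the keys that are actually present.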

Christine Madsen – Bodleian Libraries

They have collections they want to make more usable to machines and people, and to get rid of silos, e.g. allow the content to be aggregated by others – Europeana etc. They will provide some metadata, OAI, Open Linked Data, via APIs. Using the commercial iNQUIRE interface from Armadillo Systems for people to search the repo. (iNQUIRE runs over a Solr index, similar to Project Blacklight – but is it open source?)

Me – JISC Open Biblio 2

We are working on building interfaces for people to quickly share their collections in useful ways – how can we make sure it is easy to share and consume this metadata?

John Gilby – M25 Consortium

Reusing Copac records and comparing with M25 libraries to build higher quality bibliographic metadata – using open source. Developing an API for embedding in resource discovery systems. Practical guidelines for open metadata principles. Should show them our principles.

Contextual Wrappers

Using collection descriptions to help people find relevant collections. Using Culture Grid for metadata. Working with people like University Museums in Scotland to share collection descriptions.

Eric Cross, Stephen McGough – Newcastle University – The Cutting Edge

Bringing together archaeological and ethnographic museum objects – sharp-edged objects such as tools, axes etc as the focus. For performing use-wear analysis on the objects. Building a comprehensive metadata internet collection about such objects. Spanning multiple collections and searching across them using SPARQL queries, presenting a single front end.

Leif Isaksen – Southampton – Pelagios 2

Moved from JISC Geo – enabling linked ancient Geo data in open systems, focussing on lightweight annotation approach. Looking for connections between documents that are related by place, mapping and visualising the connections. Building a cataloguing, search, visualisation service and community toolkit.

Step change

Building systems to generate linked data from archivist workflows, so they do not need to care about RDF themselves. Analyses descriptions against OpenCalais. Aiming to connect it to Calm, so that archivists can locally generate linked data and push it to their Calm UIs. Similarly doing this with Historypin – who generate UIs to show interesting historical things near a certain geographic area. Working in context with Cumbria archive service. Launching an API via UKAT.

Edina Geotagger, MediaHub and SUNCAT

Providing a web service around ExifTool, to geotag / geocode image, audio, and video metadata. MediaHub provides images, video and audio licensed by JISC collections and harvested from other providers. Can parse records and identify probable people, places and dates – then suggest values to users. Hoping to generate better participation by just asking users “is this a date” and so on, rather than them having to type into a form. SUNCAT aggregates journal holdings info from 82 UK libraries, as MarcXML and linked data. Aiming to increase the amount of metadata that is openly available. Working to establish use cases for formats such as MODS / DC, and exploring the licensing status of RDF triples.

ServiceCORE

Provides a jQuery plugin for an institutional repo that shows relevant information about similar entries in other repos elsewhere. Aiming to build a web service layer providing programmable access to aggregated content and metadata in institutional repos. Also a pilot tool for automatic subject-based classification of content using text categorisation techniques. Overall, providing an enhanced related resource discovery system based on text mining. Hoping to offer a way for people to find versions of papers that they can access, as opposed to ones they cannot. Building on OAI-PMH compliant repos.
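For anyone unfamiliar with OAI-PMH, here is a minimal sketch of what harvesting from a compliant repo involves: build a ListRecords request, then pull identifiers and titles out of the Dublin Core response. This is my own illustration, not ServiceCORE's code. The base URL and the sample response are made up; the verb, the oai_dc metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications. A real harvester would also need to fetch the URL and follow resumption tokens, which is left out here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract the identifier and first title from each record in a response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append({"identifier": identifier, "title": title})
    return records


# Made-up response from a hypothetical repository, trimmed to the essentials.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("https://repo.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

Because every compliant repo answers the same request in the same shape, one small parser like this can be pointed at many repos, which is what makes aggregation across institutions practical.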

Clock – Lincoln and Cambridge

This is a continuation of Jerome. Enriched bib data from Jerome, Comet and elsewhere. A set of dev APIs and linked data endpoints. Aiming to establish a distributed scholarly catalogue for the UK. Planning to work closely with JISC open bib.

LUNCH!

Afternoon session – grouping up around particular themes

RDF
Sharing data
Authorities/indexes/access points
User interfaces
Collection descriptions across A + H

I joined the sharing data group. Discussion turned out to be quite brief, and just covered “what is the problem of sharing data?” There was no time to come up with a response.

Studies in Discovery

The Discovery ecology is aiming to explain why institutions should do it / take part in it. It will do this by writing up case studies. Doing the 12 criteria of Discovery will mean that you are “doing” discovery:

adopting open licensing
requiring clear reasonable terms and conditions
using easily understood data models
deploying persistent identifiers
establishing data relationships by re-using authoritative identifiers
providing clear mechanisms for accessing APIs
documenting APIs
adopting widely understood data formats
ensuring data is sustainable
ensuring services are supported
using your own APIs
collecting data to measure use

Business case

How is our project to be sustainable? What will maintain it? How will it survive long term, beyond JISC funding? Need to provide feedback to JISC Discovery about what suitable business cases are.

Group discussions

We considered how the 12 studies listed above had already been or would be considered by our projects. In some cases, some of the study topics were not relevant, but on the whole they are useful. We will keep these topics in mind whilst writing up project blog posts, and may well write specific posts about those topics.

Of particular interest to me was the “adopting widely understood data formats” study – because this goes beyond the scope of any one project, but is also something that would be of benefit. However, whether or not it is of benefit depends on whether or not any two or more people / groups decide it is of benefit… I will follow up on the Discovery mailing list with information about our current thinking on BibJSON, with details about our parsers API (once I have finished it), and links to the metadata guide that has been created by Primavera and others.

That is the end. I will follow up with some projects after today, and we are already meeting up with CLOCK (Cambridge and Lincoln). The next Discovery meetings will be in April and then in July.
