We are happy to announce Glottolog/Langdoc, a comprehensive knowledge base of references treating (mostly) underdescribed languages from around the world.
Glottolog/Langdoc is built upon a collation of 20 source bibliographies covering the whole world, from Alaska to Australia. The original bibliographies were parsed and enriched with machine learning techniques. This makes it possible to formulate queries such as
and combinations thereof such as
Furthermore, an areally and genealogically balanced sample can be drawn.
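To illustrate what a balanced sample means here, the following is a minimal sketch, not the procedure Glottolog/Langdoc actually uses. It assumes each record carries an area and a family label and draws at most one record per (area, family) combination. The function name `balanced_sample`, the field names, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(records, per_stratum=1, seed=0):
    """Draw up to `per_stratum` records from each (area, family) stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the records by their (area, family) combination.
    strata = defaultdict(list)
    for rec in records:
        strata[(rec["area"], rec["family"])].append(rec)

    # Sample from every stratum, visiting strata in a stable order.
    sample = []
    for key in sorted(strata):
        group = strata[key]
        k = min(per_stratum, len(group))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical toy data; real records would come from the database.
records = [
    {"language": "lang-1", "area": "area-A", "family": "family-X"},
    {"language": "lang-2", "area": "area-A", "family": "family-X"},
    {"language": "lang-3", "area": "area-A", "family": "family-Y"},
    {"language": "lang-4", "area": "area-B", "family": "family-Z"},
    {"language": "lang-5", "area": "area-B", "family": "family-Z"},
]

sample = balanced_sample(records)
for rec in sample:
    print(rec["area"], rec["family"], rec["language"])
```

Stratifying on both labels keeps a large, well-documented family or region from dominating the sample.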
All references have their own URIs. All resources are available as XHTML and RDF, and can be downloaded as bib, html, txt, or via Zotero. A dump of all references is available as a very large .bib file and as RDF/XML. Glottolog/Langdoc content is made available under CC-BY-NC. Intercultural issues upstream unfortunately prevent us from releasing the content under a more permissive license.
Glottolog/Langdoc uses DCMI, BIBO, FRBR, and ISBD ontologies to provide an interoperable resource. Glottolog is part of the Linguistic Linked Open Data cloud. We are working on a SPARQL endpoint, which will probably be made available in June this year.
We are happy to contribute our bibliographical resource to the Linked Data cloud and would welcome feedback at firstname.lastname@example.org.