BibJSON feedback report

(This is a guest page written by Matthew R. Watkins on behalf of the JISC Open Bibliography 2 project.)

Introduction

This report concerns my conversion into BibJSON of bibliographic data compiled between 1998 and 2012 as part of my “Number Theory and Physics Web-archive”, a well-used site hosted by the University of Exeter, UK (http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/physics.htm).

The site contains annotated bibliographies of material which concerns the (often surprising) interface between number theory and various branches of physics. It contains many dozens of pages and approximately 1800 records. I have converted the 677 bibliographic records which live on the site’s eight main pages:

  • http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/physics1.htm
  • http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/physics2.htm
  • http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/physics8.htm

In many cases these records were partial, fragmentary, incorrect, or out of date (often they were arXiv preprints which I archived shortly after their appearance, but failed to update when they were eventually published as journal articles).

My metadata currently looks like this (I’m not sure if this is how it’s intended to be used in the case of each key):

"metadata": {
    "description": "a collection of bibliographic items relevant to the interface between number theory and various aspects of physics", 
    "created": "2012-04-29T16:05:23.055882", 
    "modified": "2012-04-29T16:05:23.055882", 
    "collection": "NTP", 
    "label": "Number Theory and Physics Web-archive", 
    "source": "http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/physics.htm", 
    "records": 677, 
    "id": "Number Theory and Physics Web-archive"
}
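
As a rough illustration of how a metadata block like this might be generated alongside the records, here is a minimal Python sketch. The function name build_metadata and its parameters are my own assumptions, not part of BibJSON or of the scripts used for this report; the key names simply mirror the block above.

```python
from datetime import datetime


def build_metadata(records, collection, label, source, description, now=None):
    """Assemble a collection-level metadata dictionary (hypothetical helper)."""
    # ISO 8601 timestamp, the same shape as the "created"/"modified" values
    # shown above (Python includes microseconds when they are non-zero).
    stamp = (now or datetime.now()).isoformat()
    return {
        "description": description,
        "created": stamp,
        "modified": stamp,
        "collection": collection,
        "label": label,
        "source": source,
        "records": len(records),  # derived from the data rather than typed by hand
        "id": label,
    }
```

Deriving "records" from the list itself is a design choice: it keeps the count from drifting out of step with the collection as records are added or removed.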

Method

I began by disaggregating the HTML source files for the webpages in question using a specially tailored Python script which outputs a plain-text format for ease of editing. This format is a simple alternative to BibTeX which I’ve developed, where a typical archive entry like…

G. Sierra, "H = xp with interaction and the Riemann zeros", Nucl. Phys. B 776 (3) (2007) 327–364

…ended up looking like this:

@@ARTICLE
@AUTHOR G. Sierra
@TITLE H = xp with interaction and the Riemann zeros
@JOURNAL Nucl. Phys. B
@VOLUME 776
@YEAR 2007
@NUMBER 3
@PAGES 327--364

I then checked and enriched these records, using a combination of the sites Mathematical Reviews, ZBL Mathematics, arXiv, Google Scholar, Google Books and Jahrbuch fur Mathematik. As much classification and identifying information as possible was gathered, and an abstract was included wherever possible (rendered in decent TeX, often where the source was a confused mix of text formats and embedded graphics). The example above, post-enrichment, looks like this:

@@ARTICLE MR2338679
@AUTHOR Sierra, Germ\'an
@TITLE $H = xp$ with interaction and the {R}iemann zeros
@JOURNAL Nucl. Phys. B
@FJOURNAL Nuclear Physics B. Particle Physics, Field Theory and Statistical Systems, Physical Mathematics.
@VOLUME 776
@YEAR 2007
@NUMBER 3
@ISSN 0550-3213
@PAGES 327--364
@MRNUMBER MR2338679
@ZBLNUMBER 1200.11099
@ARXIV math-ph/0702034
@MRCLASS 11M26 (11Z05 81Q20 81T17)
@MSC2010CLASS 11Z05; 11M26; 81Q20; 81T17
@DOI 10.1016/j.nuclphysb.2007.03.049
@ABSTRACT Starting from a quantized version of the classical Hamiltonian $H=xp$, we add a non-local interaction which depends on 
two potentials. The model is solved exactly in terms of a Jost like function which is analytic in the complex upper half plane. 
This function vanishes, either on the real axis, corresponding to bound states, or below it, corresponding to resonances. We find 
potentials for which the resonances converge asymptotically toward the average position of the Riemann zeros. These potentials 
realize, at the quantum level, the semiclassical regularization of $H=xp$ proposed by Berry and Keating. Furthermore, a linear 
superposition of them, obtained by the action of integer dilations, yields a Jost function whose real part vanishes at the 
Riemann zeros and whose imaginary part resembles the one of the zeta function. Our results suggest the existence of a quantum 
mechanical model where the Riemann zeros would make a point like spectrum embedded in the continuum. The associated spectral 
interpretation would resolve the emission/absorption debate between Berry--Keating and Connes. Finally, we indicate how our 
results can be extended to the Dirichlet $L$-functions constructed with real characters.
@PACS 02.10.De; 05.45.Mt; 11.10.Hi

I then developed a script to convert these ‘bibtxt’ records to BibJSON. The example above became this:

{   
    "title": "$H = xp$ with interaction and the {R}iemann zeros", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR2338679", 
    "abstract": {
        "text": "Starting from a quantized version of the classical Hamiltonian $H=xp$, we add a non-local interaction which depends on two potentials. The model is solved exactly in terms of a Jost like function which is analytic in the complex upper half plane. This function vanishes, either on the real axis, corresponding to bound states, or below it, corresponding to resonances. We find potentials for which the resonances converge asymptotically toward the average position of the Riemann zeros. These potentials realize, at the quantum level, the semiclassical regularization of $H=xp$ proposed by Berry and Keating. Furthermore, a linear superposition of them, obtained by the action of integer dilations, yields a Jost function whose real part vanishes at the Riemann zeros and whose imaginary part resembles the one of the zeta function. Our results suggest the existence of a quantum mechanical model where the Riemann zeros would make a point like spectrum embedded in the continuum. The associated spectral interpretation would resolve the emission/absorption debate between Berry--Keating and Connes. Finally, we indicate how our results can be extended to the Dirichlet $L$-functions constructed with real characters."
    }, 
    "author": [
        {
            "name": "Sierra, Germ\\'an"
        }
    ], 
    "collection": "NTP", 
    "id": "MR2338679", 
    "owner": "Matthew_R_Watkins", 
    "journal": {
        "volume": "776", 
        "shortcode": "Nucl. Phys. B", 
        "identifier": [
            {
                "type": "issn", 
                "id": "0550-3213"
            }
        ], 
        "issue": {
            "number": "3"
        }, 
        "name": "Nuclear Physics B. Particle Physics, Field Theory and Statistical Systems, Physical Mathematics."
    }, 
    "date": {
        "year": "2007"
    }, 
    "identifier": [
        {
            "url": "http://dx.doi.org/10.1016/j.nuclphysb.2007.03.049", 
            "type": "DOI", 
            "id": "10.1016/j.nuclphysb.2007.03.049"
        }, 
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=1200.11099", 
            "type": "ZMATH", 
            "id": "1200.11099"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=2338679", 
            "type": "MR", 
            "id": "MR2338679"
        }, 
        {
            "url": "http://www.arxiv.org/abs/math-ph/0702034", 
            "type": "arXiv", 
            "id": "math-ph/0702034"
        }
    ], 
    "type": "article", 
    "pages": "327--364", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11M26", 
                "secondary": [
                    "11Z05", 
                    "81Q20", 
                    "81T17"
                ]
            }
        }, 
        {
            "codes": [
                "11Z05", 
                "11M26", 
                "81Q20", 
                "81T17"
            ], 
            "id": "msc2010class"
        }, 
        {
            "codes": [
                "02.10.De", 
                "05.45.Mt", 
                "11.10.Hi"
            ], 
            "id": "pacs"
        }
    ]
}

There was then some work involved in checking the BibJSON output, adjusting the script and then iterating this process until the output was satisfactory.
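
To make the conversion step more concrete, here is a minimal Python sketch of how a ‘bibtxt’ record of the kind shown above could be parsed and mapped onto a BibJSON-style dictionary. It is not the script used for this report: the function names (parse_bibtxt, parse_mrclass, to_bibjson) are hypothetical, only a handful of fields are handled, and the " and " author separator is an assumption borrowed from BibTeX. The identifier URL patterns are the ones visible in the output above.

```python
import json
import re


def parse_bibtxt(text):
    """Parse one 'bibtxt' record into (entry_type, key, fields)."""
    entry_type, key, fields, current = None, None, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@@"):
            # Header line, e.g. "@@ARTICLE MR2338679" (the key is optional).
            parts = line[2:].split(None, 1)
            entry_type = parts[0].lower()
            key = parts[1] if len(parts) > 1 else None
        elif line.startswith("@"):
            name, _, value = line[1:].partition(" ")
            current = name.lower()
            fields[current] = value.strip()
        elif current is not None:
            # Continuation of a multi-line field such as @ABSTRACT.
            fields[current] += " " + line
    return entry_type, key, fields


def parse_mrclass(value):
    """Split '11M26 (11Z05 81Q20)' into primary and secondary codes."""
    match = re.match(r"\s*(\S+)\s*(?:\((.*)\))?\s*$", value)
    if match is None:
        return {"primary": None, "secondary": []}
    secondary = match.group(2)
    return {"primary": match.group(1),
            "secondary": secondary.split() if secondary else []}


def to_bibjson(entry_type, key, fields):
    """Map parsed fields onto a BibJSON-style dictionary."""
    record = {"type": entry_type, "id": key, "title": fields.get("title")}
    if "author" in fields:
        # Assumption: several authors are joined by " and ", as in BibTeX.
        names = fields["author"].split(" and ")
        record["author"] = [{"name": n.strip()} for n in names]
    if "journal" in fields:
        journal = {"shortcode": fields["journal"]}
        if "fjournal" in fields:
            journal["name"] = fields["fjournal"]
        if "volume" in fields:
            journal["volume"] = fields["volume"]
        if "number" in fields:
            journal["issue"] = {"number": fields["number"]}
        if "issn" in fields:
            journal["identifier"] = [{"type": "issn", "id": fields["issn"]}]
        record["journal"] = journal
    if "year" in fields:
        record["date"] = {"year": fields["year"]}
    if "pages" in fields:
        record["pages"] = fields["pages"]
    if "abstract" in fields:
        record["abstract"] = {"text": fields["abstract"]}
    identifiers = []
    if "doi" in fields:
        identifiers.append({"type": "DOI", "id": fields["doi"],
                            "url": "http://dx.doi.org/" + fields["doi"]})
    if "arxiv" in fields:
        identifiers.append({"type": "arXiv", "id": fields["arxiv"],
                            "url": "http://www.arxiv.org/abs/" + fields["arxiv"]})
    if identifiers:
        record["identifier"] = identifiers
    subjects = []
    if "mrclass" in fields:
        subjects.append({"id": "mrclass",
                         "code_dict": parse_mrclass(fields["mrclass"])})
    for scheme in ("msc2010class", "pacs"):
        if scheme in fields:
            codes = [c.strip() for c in fields[scheme].split(";") if c.strip()]
            subjects.append({"id": scheme, "codes": codes})
    if subjects:
        record["subject"] = subjects
    return record


# A shortened version of the enriched example above.
sample = r"""@@ARTICLE MR2338679
@AUTHOR Sierra, Germ\'an
@TITLE $H = xp$ with interaction and the {R}iemann zeros
@JOURNAL Nucl. Phys. B
@VOLUME 776
@YEAR 2007
@NUMBER 3
@PAGES 327--364
@ARXIV math-ph/0702034
@MRCLASS 11M26 (11Z05 81Q20 81T17)
@PACS 02.10.De; 05.45.Mt; 11.10.Hi
"""
print(json.dumps(to_bibjson(*parse_bibtxt(sample)), indent=4))
```

Treating any line that does not start with “@” as a continuation of the previous field is what lets a multi-line @ABSTRACT collapse into a single string, as in the BibJSON output above.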

Documentation

Internal referencing within the collection

For BibTeX types ‘incollection’ and ‘inproceedings’, I introduced a “book” object (dictionary) with title, editor, identifier (containing ZBL and MR numbers, etc., as well as ISBNs), volume and edition, together with “publisher” and “series” sub-objects. “publisher” is a dictionary with keys “name” and “address”. “series” is a dictionary with “name”, “number” (often mistakenly called “volume”, a widespread bibliographic confusion), very occasionally “series volume”, and “issn” (when available); a “series editor” key could also be added. See below for more on the “series” object.

When an ‘incollection’-type or ‘inproceedings’-type article is in a book which also appears in the collection, the book object just has the key “id”, whose value is the collection id of the book in question.

{
    "title": "The $10^{22}$-nd zero of the {R}iemann zeta function", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR1868473", 
    "bibtex_type": "inproceedings", 
    "author": [
        "Odlyzko, A. M."
    ], 
    "collection": "NTP", 
    "id": "MR1868473", 
    "abstract": {
        "text": "Recent and ongoing computations of zeros of the Riemann zeta function are described. They include the computation of $10$ billion zeros near zero number $10^22$. These computations verify the Riemann Hypothesis for those zeros, and provide evidence for additional conjectures that relate these zeros to eigenvalues of random matrices."
    }, 
    "book": {
        "id": "MR1868465"
    }, 
    "link": [
        {
            "url": "http://www.dtc.umn.edu/~odlyzko/doc/zeta.10to22.tex", 
            "anchor": "LINK"
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=1022.11042", 
            "type": "ZMATH", 
            "id": "1022.11042"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=1868473", 
            "type": "MR", 
            "id": "MR1868473"
        }
    ], 
    "pages": {
        "start_page": "139", 
        "end_page": "144"
    }, 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11M26", 
                "secondary": [
                    "11Y35"
                ]
            }
        }
    ]
}

where the

"book": {
    "id": "MR1868465"
}

refers to

{
    "publisher": {
        "name": "American Mathematical Society", 
        "address": "Providence, RI"
    }, 
    "isbn": [
        "9780821820797", 
        "0821820796"
    ], 
    "title": {
        "main_title": "Dynamical, spectral, and arithmetic zeta-functions", 
        "subtitle": "{AMS} special session on dynamical, spectral and arithmetic zeta functions"
    }, 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR1868465", 
    "series": {
        "issn": "0271-4132", 
        "name": "Contemporary Mathematics", 
        "number": "290"
    }, 
    "bibtex_type": "proceedings", 
    "venue": {
        "venue_state": "TX", 
        "venue_days-months": "January 15--16", 
        "venue_string": "San Antonio, TX, January 15--16, 1999", 
        "venue_loc": "San Antonio, TX", 
        "venue_date": "January 15--16, 1999", 
        "venue_year": "1999", 
        "venue_city": "San Antonio"
    }, 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR1868465", 
    "keywords": [
        "special session", 
        "proceedings", 
        "AMS", 
        "San Antonio, TX (USA)", 
        "dynamical zeta functions", 
        "spectral zeta functions", 
        "arithmetic zeta functions", 
        "zeta functions"
    ], 
    "date": {
        "year": "2001"
    }, 
    "identifier": [
        {
            "url": "http://dx.doi.org/10.1090/conm/290", 
            "type": "DOI", 
            "id": "10.1090/conm/290"
        }, 
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0980.00025", 
            "type": "ZMATH", 
            "id": "0980.00025"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=1868465", 
            "type": "MR", 
            "id": "MR1868465"
        }
    ], 
    "pages": "x+195pp.", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11-06", 
                "secondary": [
                    "37C30"
                ]
            }
        }, 
        {
            "codes": [
                "00B25", 
                "11-06", 
                "37-06", 
                "58-06"
            ], 
            "id": "msc2000class"
        }
    ]
}

The fact that this kind of internal referencing is possible is extremely welcome: all of the necessary data for the book can be imported as needed, and since there’s no need to replicate it across the collection (where several articles from the same book are included), it avoids the possibility of erroneous or missing fields in the replicated copies.
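
A consumer of the collection then needs to follow these references. The following Python sketch shows one way this might be done, assuming the collection has been loaded into a dictionary keyed by record id; the function names resolve_book and articles_by_book are hypothetical and not part of BibJSON.

```python
def resolve_book(record, collection):
    """Return the full book data for an incollection/inproceedings record.

    `collection` is assumed to be a dict mapping record ids to records.
    """
    book = record.get("book")
    if book is None:
        return None
    if set(book) == {"id"}:
        # Internal reference: the book is itself a record in the collection.
        # A dangling reference raises KeyError, which makes it easy to spot.
        return collection[book["id"]]
    # Otherwise the book object was filled out in place.
    return book


def articles_by_book(collection):
    """Group the ids of articles under the collection id of their book."""
    grouped = {}
    for record in collection.values():
        book = record.get("book", {})
        if set(book) == {"id"}:
            grouped.setdefault(book["id"], []).append(record["id"])
    return grouped


# Heavily abbreviated versions of the two records shown above.
collection = {
    "MR1868473": {"id": "MR1868473", "book": {"id": "MR1868465"}},
    "MR1868465": {"id": "MR1868465", "bibtex_type": "proceedings",
                  "date": {"year": "2001"}},
}
print(resolve_book(collection["MR1868473"], collection)["date"]["year"])
print(articles_by_book(collection))
```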

Where the ‘incollection’-type or ‘inproceedings’-type article is from a book NOT in the collection, the ‘book’ object must be filled out as usual, e.g.

{
    "other_relevant_persons": [
        {
            "name": "David, Herbert Aron"
        }
    ], 
    "title": "Moments of {C}auchy order statistics via {R}iemann zeta functions", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR1462444", 
    "abstract": {
        "text": "We obtain exact expressions for the moments of single order statistics from a standard Cauchy distribution. These are expressed as linear combinations of Riemann zeta functions. Using these and numerical integration methods, means of order statistics from samples of sizes up to 25 have been tabulated. Second order moments and variances are then obtained by applying the recurrence relation given by [Barnett, 1966]. They are also tabulated. Finally, we obtain expressions for product moments in terms of means of order statistics and Riemann zeta functions.", 
        "references": [
            {
                "ref_id": "Barnett, 1966", 
                "ref_string": "V.D. Barnett, ``Order statistics estimators of the location of the Cauchy distribution'' [MR0205363]"
            }
        ]
    }, 
    "author": [
        {
            "name": "Joshi, P. C."
        }, 
        {
            "name": "Chakraborty, Sharmishtha"
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR1462444", 
    "note": "Papers in honor of Herbert A. David", 
    "keywords": [
        "second order moments", 
        "Bernoulli numbers", 
        "Bernoulli polynomials", 
        "exact expressions", 
        "moments of single order statistics", 
        "Cauchy distribution", 
        "linear combinations of Riemann zeta functions", 
        "numerical integration", 
        "means of order statistics", 
        "recurrence relation", 
        "product moments"
    ], 
    "book": {
        "publisher": {
            "name": "Springer-Verlag", 
            "address": "Berlin, Germany"
        }, 
        "identifier": [
            {
                "type": "isbn-13", 
                "id": "9780387945910"
            }
        ], 
        "editor": [
            {
                "name": "Nagaraja, Haikady Navada"
            }, 
            {
                "name": "Sen, Pranab Kumar"
            }, 
            {
                "name": "Morrison, Donald F."
            }
        ], 
        "title": "Statistical theory and applications"
    }, 
    "date": {
        "year": "1996"
    }, 
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0846.62038", 
            "type": "ZMATH", 
            "id": "0846.62038"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=1462444", 
            "type": "MR", 
            "id": "MR1462444"
        }
    ], 
    "type": "incollection", 
    "pages": "117--127", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "62G30", 
                "secondary": [
                    "11M06", 
                    "62E15"
                ]
            }
        }
    ]
}

The handling of internal references in abstracts

Sometimes abstracts contain references to items in their article’s bibliography. To deal with this, I standardised the referencing scheme (as “[author, year]”) and treated “abstract” as an object, a dictionary with keys “text” (the abstract itself, with these standardised reference id strings) and “references”, whose value is a list of dictionaries associating the id strings with useful bibliographic information.

Here is an example:

"abstract": {
    "text": "In this paper, we construct a natural $C^*$-dynamical system whose partition function is the Riemann $\\zeta$ function. Our construction is general and associates to an inclusion of rings (under a suitable finiteness assumption) an inclusion of discrete groups (the associated $ax+b$ groups) and the corresponding Hecke algebras of bi-invariant functions. The latter algebra is endowed with a canonical one parameter group of automorphisms measuring the lack of normality of the subgroup. The inclusion of rings $\\mathbb{Z}\\subset\\mathbb{Q}$ provides the desired $C^*$-dynamical system, which admits the $\\zeta$ function as partition function and the Galois group $Gal(\\mathbb{Q}^{cycl}/\\mathbb{Q})$ of the cyclotomic extension $\\mathbb{Q}^{cycl}$ of $\\mathbb{Q}$ as symmetry group. Moreover, it exhibits a phase transition with spontaneous symmetry breaking at inverse temperature $\\beta=1$ (cf. [Bost and Connes, 1992]). The original motivation for these results comes from the work of B. Julia [Julia, 1990] (cf. also [Spector, 1990]).", 
    "references": [
        {
            "ref_id": "Julia, 1990", 
            "ref_string": "MR1058473"
        }, 
        {
            "ref_id": "Bost and Connes, 1992", 
            "ref_string": "J.-B. Bost and A. Connes, ``Produits eul\\'eriens et facteurs de type ${III}$'', C. R. Acad. Sci. Paris S\\'er. I Math. 315 (1992), no. 3, 279--284. MR1179720"
        }, 
        {
            "ref_id": "Spector, 1990", 
            "ref_string": "MR1037102"
        }
    ]
}, 

Note that where the item is in the collection, I just give the collection id for the value of “ref_string”. If it isn’t, I give more complete info (basic author/title at the very least, but sometimes also journal/volume/issue/pages/year) plus the best unique identifier available (MR or ZMATH or arXiv).
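
One benefit of the standardised “[author, year]” scheme is that the citations can be checked mechanically. The Python sketch below is an illustration only: the regular expression and the function names cited_ref_ids and check_abstract are my own assumptions about how such a check could be written, and the three status labels are invented for the example.

```python
import re

# Matches citations of the form "[Bost and Connes, 1992]".
REF_PATTERN = re.compile(r"\[([^\[\]]+, \d{4})\]")


def cited_ref_ids(abstract):
    """Return the reference id strings cited in the abstract text."""
    return REF_PATTERN.findall(abstract["text"])


def check_abstract(abstract, collection_ids):
    """Classify each citation as 'internal', 'external' or 'missing'."""
    refs = {r["ref_id"]: r["ref_string"]
            for r in abstract.get("references", [])}
    report = {}
    for ref_id in cited_ref_ids(abstract):
        if ref_id not in refs:
            report[ref_id] = "missing"    # cited but not listed
        elif refs[ref_id] in collection_ids:
            report[ref_id] = "internal"   # ref_string is a collection id
        else:
            report[ref_id] = "external"   # fuller bibliographic string
    return report


abstract = {
    "text": "... (cf. [Bost and Connes, 1992]). The original motivation "
            "comes from the work of B. Julia [Julia, 1990].",
    "references": [
        {"ref_id": "Julia, 1990", "ref_string": "MR1058473"},
        {"ref_id": "Bost and Connes, 1992",
         "ref_string": "J.-B. Bost and A. Connes, ... MR1179720"},
    ],
}
print(check_abstract(abstract, collection_ids={"MR1058473"}))
```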

Newly introduced keys

I’ve introduced some of my own keys (some of these inspired by Alex Scopan, with whom I’ve worked on Celebratio Mathematica bibliographies) which aren’t BibTeX standard.

“supervisor” applies to PhD theses, as in the following:

{
    "school": "Princeton University", 
    "supervisor": [
        {
            "name": "Sarnak, Peter"
        }
    ], 
    "title": "Evidence for a spectral interpretation of zeros of $L$-functions", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/NTP85", 
    "author": [
        {
            "name": "Rubinstein, Michael"
        }
    ], 
    "collection": "NTP", 
    "owner": "Matthew_R_Watkins", 
    "date": {
        "year": "1998"
    }, 
    "type": "phdthesis", 
    "id": "NTP85"
}

Note that this is handled the same way as “author” and “editor”, so (relatively rare) situations with multiple supervisors can be handled. And as with those other keys, the persons in question can be further associated with Math. Genealogy Project identifiers, ZBL author-cluster identifiers, homepages, or any other relevant information.

“venue” is an object introduced for “proceedings” and “inproceedings” types, as well as “article” and “unpublished” (preprint) types which are stated to have arisen from specific conferences. Here are some examples:

"venue": {
    "venue_days-months": "March 30--April 1", 
    "venue_string": "Glasgow, UK, March 30--April 1, 1993", 
    "venue_loc": "Glasgow, UK", 
    "venue_date": "March 30--April 1, 1993", 
    "venue_year": "1993", 
    "venue_country": "UK", 
    "venue_city": "Glasgow"
}

"venue": {
    "venue_state": "KY", 
    "venue_days-months": "July 20--25", 
    "venue_string": "Lexington, KY, July 20--25, 2009", 
    "venue_loc": "Lexington, KY", 
    "venue_date": "July 20--25, 2009", 
    "venue_year": "2009", 
    "venue_city": "Lexington"
}

"venue": {
    "venue_days-months": "June 15--21", 
    "venue_string": "Varna, Bulgaria, June 15--21, 2009", 
    "venue_loc": "Varna, Bulgaria", 
    "venue_date": "June 15--21, 2009", 
    "venue_year": "2009", 
    "venue_country": "Bulgaria", 
    "venue_city": "Varna"
}

"venue": {
    "venue_string": "Toulouse, France, 1998", 
    "venue_year": "1998", 
    "venue_loc": "Toulouse, France"
}

Clearly this could be further elaborated, with “venue_start_date” and “venue_end_date” strings (or objects), as well as “venue_institution”, although that seems rather extravagant at this stage.
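
Since several of the “venue” keys are derivable from the others, one option is to build the object from its smallest parts so that the composite strings cannot disagree with each other. The Python sketch below illustrates this; make_venue and its parameters are hypothetical, and unlike the Toulouse example above it always emits “venue_city” (and “venue_country” or “venue_state” when given).

```python
def make_venue(city, year, state=None, country=None, days_months=None):
    """Build a "venue" object from its atomic parts (hypothetical helper)."""
    region = state or country
    loc = f"{city}, {region}" if region else city
    venue = {"venue_city": city, "venue_loc": loc, "venue_year": str(year)}
    if state:
        venue["venue_state"] = state
    if country:
        venue["venue_country"] = country
    if days_months:
        # e.g. "March 30--April 1"
        venue["venue_days-months"] = days_months
        venue["venue_date"] = f"{days_months}, {year}"
        venue["venue_string"] = f"{loc}, {days_months}, {year}"
    else:
        venue["venue_string"] = f"{loc}, {year}"
    return venue


print(make_venue("Glasgow", 1993, country="UK",
                 days_months="March 30--April 1"))
```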

There’s a standard BibTeX “techreport” type, which would typically be handled like this…

{
    "title": "$p$-{A}dic iterations", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/NTP68", 
    "author": [
        {
            "name": "Ben-Menahem, S."
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "report": {
        "date": {
            "year": "1988"
        }, 
        "number": "TAUP 1627-88", 
        "institution": "Tel-Aviv University"
    }, 
    "ednote": "No ZBL, arXiv, MR", 
    "date": {
        "year": "1988"
    }, 
    "type": "techreport", 
    "id": "NTP68"
}

I introduced the “report” object, which can then also be used for other types, e.g. “article”-type records where a published journal article began life as a technical report, e.g.:

{
    "title": "Vacuum stability, string density of states and the {R}iemann zeta function", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR2820824", 
    "abstract": {
        "text": "We study the distribution of graded degrees of freedom in classically stable\noriented closed string vacua and use the Rankin-Selberg transform to link it to the \fnite\none-loop vacuum energy. In particular, we find that the spectrum of physical excitations not\nonly must enjoy \\textit{asymptotic supersymmetry} but actually, at very large mass, bosonic and\nfermionic states must follow a universal oscillating pattern, whose frequencies are related\nto the zeros of the Riemann $\\zeta$-function. Moreover, the convergence rate of the overall\nnumber of the graded degrees of freedom to the value of the vacuum energy is determined\nby the Riemann hypothesis. We discuss also attempts to obtain constraints in the case of\ntachyon-free open-string theories."
    }, 
    "author": [
        {
            "name": "Angelantonj, Carlo"
        }, 
        {
            "name": "Cardella, Matteo"
        }, 
        {
            "name": "Elitzur, Shmuel"
        }, 
        {
            "name": "Rabinovici, Eliezer"
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR2820824", 
    "keywords": [
        "superstrings and heterotic strings", 
        "superstring vacua"
    ], 
    "report": {
        "date": {
            "year": "2011"
        }, 
        "number": "DFTT 26/2010", 
        "institution": "Department of Physics, University of Torino"
    }, 
    "journal": {
        "volume": "2011", 
        "shortcode": "J. High Energy Phys.", 
        "identifier": [
            {
                "type": "issn", 
                "id": "1126-6708"
            }
        ], 
        "issue": {
            "number": "2"
        }, 
        "name": "Journal of High Energy Physics"
    }, 
    "ednote": "No ZBL number", 
    "date": {
        "year": "2011"
    }, 
    "identifier": [
        {
            "url": "http://dx.doi.org/10.1007/JHEP02(2011)024", 
            "type": "DOI", 
            "id": "10.1007/JHEP02(2011)024"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=2820824", 
            "type": "MR", 
            "id": "MR2820824"
        }, 
        {
            "url": "http://www.arxiv.org/abs/1012.5091", 
            "type": "arXiv", 
            "id": "1012.5091"
        }
    ], 
    "type": "article", 
    "pages": "1--27"
}

Note that the “report” object has a “date” subobject. This could be useful when the date on which the technical report was issued differs from the date of publication of the journal article (the main “date” key for the record).

Another useful key is “other_relevant_persons”, which lists anyone associated with the item who isn’t one of the authors, editors, or supervisors. This could be someone to whom a particular volume, journal issue or article was dedicated, or a translator.

Here are a couple of examples:

{
    "other_relevant_persons": [
        {
            "name": "Abad, Julio"
        }
    ], 
    "title": {
        "main_title": "A physics pathway to the {R}iemann hypothesis", 
        "subtitle": "Julio {A}bad, in memoriam"
    }, 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/1012.4264", 
    "abstract": {
        "text": "We present a brief review of the spectral approach to the Riemann hypothesis, according to which the imaginary part of the non trivial zeros of the zeta function are the eigenvalues of the Hamiltonian of a quantum mechanical system."
    }, 
    "author": [
        {
            "name": "Sierra, German"
        }
    ], 
    "collection": "NTP", 
    "owner": "Matthew_R_Watkins", 
    "book": {
        "publisher": {
            "name": "Prensas Universitarias de Zaragoza"
        }, 
        "identifier": [
            {
                "type": "isbn-13", 
                "id": "9788492774043"
            }
        ], 
        "editor": [
            {
                "name": "Asorey Carballeira, M."
            }, 
            {
                "name": "Garc\\'ia Esteve, J. V."
            }, 
            {
                "name": "Ra\\~nada, M. F."
            }, 
            {
                "name": "Sesma, J."
            }
        ], 
        "title": "Mathematical physics and field theory"
    }, 
    "ednote": "Unable to find page numbers, or even verification that the book version was in English.", 
    "date": {
        "year": "2009"
    }, 
    "identifier": [
        {
            "url": "http://www.arxiv.org/abs/1012.4264", 
            "type": "arXiv", 
            "id": "1012.4264"
        }
    ], 
    "type": "incollection", 
    "id": "1012.4264"
}

and

{
    "publisher": {
        "name": "Academic Press", 
        "address": "Orlando, FL"
    }, 
    "other_relevant_persons": [
        {
            "name": "Randol, Burton"
        }, 
        {
            "name": "Dodziuk, Jozef"
        }
    ], 
    "author": [
        {
            "name": "Chavel, Isaac"
        }
    ], 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR0768584", 
    "series": {
        "issn": "0079-8169", 
        "name": "Pure and Applied Mathematics", 
        "number": "115"
    }, 
    "title": "Eigenvalues in {R}iemannian geometry", 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR0768584", 
    "note": "Including a chapter by Burton Randol. With an appendix by Jozef Dodziuk.", 
    "keywords": [
        "Laplace operator", 
        "isoperimetric inequalities", 
        "heat equation", 
        "topological perturbations", 
        "constant negative curvature", 
        "Selberg trace formula"
    ], 
    "date": {
        "year": "1984"
    },
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0551.53001", 
            "type": "ZMATH", 
            "id": "0551.53001"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=0768584", 
            "type": "MR", 
            "id": "MR0768584"
        }, 
        {
            "type": "isbn-13", 
            "id": "9780121706401"
        }, 
        {
            "type": "isbn", 
            "id": "0121706400"
        }
    ], 
    "type": "book", 
    "pages": "xiv+362pp.", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "58G25", 
                "secondary": [
                    "35P99", 
                    "53C20"
                ]
            }
        }, 
        {
            "codes": [
                "53-02", 
                "58-02", 
                "53C20", 
                "58J50", 
                "58J60"
            ], 
            "id": "msc2000class"
        }
    ]
}

As each entry in “other_relevant_persons” is a dictionary, although it’s currently just being used to store a “name” key, there’s no reason that it can’t also store a “role” key to explain the person’s relevance to the record in question. It could be argued that “translator” and “dedication_subjects” keys could be added to introduce a further refinement.
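
As a sketch of what the suggested “role” key might look like in use, here is a short Python example. Nothing in it is implemented in the collection: the “role” values are invented for illustration (guided by the “note” field of the record above), and persons_with_role is a hypothetical helper.

```python
def persons_with_role(record, role):
    """List the names in "other_relevant_persons" that carry a given role."""
    return [person["name"]
            for person in record.get("other_relevant_persons", [])
            if person.get("role") == role]


# Abbreviated record; the "role" values are assumptions, not collection data.
record = {
    "id": "MR0768584",
    "other_relevant_persons": [
        {"name": "Randol, Burton", "role": "chapter author"},
        {"name": "Dodziuk, Jozef", "role": "appendix author"},
    ],
}
print(persons_with_role(record, "appendix author"))
```

Entries without a “role” key are simply skipped by the filter, so existing records with only a “name” would remain valid.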

An “issue” object is helpful. In most cases, the issue is just a number, as in conventional article entries with author/title/journal/volume/number/year/pages-type info. But in other cases, it has to deal with a “special” issue with a title and editors. It might consist of conference proceedings, or be dedicated to a senior mathematician. It might be specially published by some publisher other than the one which normally publishes the journal. An object is ideal for this situation. Here are some examples of how I’ve used this object in the report.

{
    "title": "Stochastic behavior in quantum scattering", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR0719062", 
    "type": "article", 
    "abstract": {
        "text": "A 2-dimensional smooth orientable, but not compact space of constant negative curvature with the topology of a torus is investigated. It contains an open end, i.e. an exceptional point at infinite distance, through which a particle or a wave can enter or leave, as in the exponential horn of certain antennas or loud-speakers. In the Poincar\\'e model of hyperbolic geometry, the solutions of Schr\\\"odinger's equation for the reflection of a particle which enters through the horn are easily constructed. The scattering phase shift as a function of the momentum is essentially given by the phase angle of Riemann's zeta function on the imaginary axis, at a distance of $\\frac{1}{2}$ from the famous critical line. This phase shift shows all the features of chaos, namely the ability to mimick any given smooth function, and great difficulty in its effective numerical computation. A plot shows the close connection with the zeros of Riemann's zeta function for low values of the momentum (quantum regime) which gets lost only at exceedingly large momenta (classical regime?) Some generalizations of this approach to chaos are mentioned."
    }, 
    "author": [
        {
            "name": "Gutzwiller, M. C."
        }
    ], 
    "collection": "NTP", 
    "id": "MR0719062", 
    "owner": "Matthew_R_Watkins", 
    "journal": {
        "volume": "7", 
        "shortcode": "Phys. D", 
        "identifier": [
            {
                "type": "issn", 
                "id": "0167-2789"
            }
        ], 
        "issue": {
            "publisher": {
                "name": "North-Holland"
            }, 
            "title": "Order in chaos", 
            "editor": [
                {
                    "name": "Campbell, David K."
                }, 
                {
                    "name": "Rose, Harvey"
                }
            ], 
            "number": "1--3"
        }, 
        "name": "Physica D: Nonlinear Phenomena"
    }, 
    "ednote": "Seemingly a special issue of Physica D . No ISBN.", 
    "date": {
        "year": "1983"
    }, 
    "identifier": [
        {
            "url": "http://dx.doi.org/10.1016/0167-2789(83)90138-0", 
            "type": "DOI", 
            "id": "10.1016/0167-2789(83)90138-0"
        }, 
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0619.10026", 
            "type": "ZMATH", 
            "id": "0619.10026"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=0719062", 
            "type": "MR", 
            "id": "MR0719062"
        }
    ], 
    "issue": {
        "venue": {
            "venue_state": "NM", 
            "venue_days-months": "May 24--28", 
            "venue_string": "Los Alamos, NM, May 24--28, 1982", 
            "venue_loc": "Los Alamos, NM", 
            "venue_date": "May 24--28, 1982", 
            "venue_year": "1982", 
            "venue_city": "Los Alamos"
        }
    }, 
    "pages": "341--355", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11F72", 
                "secondary": [
                    "11M06", 
                    "20H10", 
                    "58G25", 
                    "81C05"
                ]
            }
        }
    ]
}

{
    "title": "Composite $p$-branes in various dimensions", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR1462161", 
    "type": "article", 
    "abstract": {
        "text": "We review an algebraic method of finding the composite $p$-brane solutions for a generic Lagrangian, in arbitrary spacetime dimension, describing an interaction of a graviton, a dilaton and one or two antisymmetric tensors. We set the Fock--De Donder harmonic gauge for the metric and the ``no-force'' condition for the matter fields. Then equations for the antisymmetric field are reduced to the Laplace equation and the equation of motion for the dilaton and the Einstein equations for the metric are reduced to an algebraic equation. Solutions composed of $n$ constituent $p$-branes with $n$ independent harmonic functions are given. The form of the solutions demonstrates the harmonic functions superposition rule in diverse dimensions. Relations with known solutions in $D=10$ and $D=11$ dimensions are discussed."
    }, 
    "author": [
        {
            "name": "Aref'eva, I. Ya."
        }, 
        {
            "name": "Viswanathan, K. S."
        }, 
        {
            "name": "Volovich, A. I."
        }, 
        {
            "name": "Volovich, I. V."
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR1462161", 
    "report": {
        "date": {
            "year": "1997"
        }, 
        "number": "SMI-26-96; AIV-3-96"
    }, 
    "journal": {
        "volume": "56", 
        "shortcode": "Nucl. Phys., B, Proc. Suppl.", 
        "identifier": [
            {
                "type": "issn", 
                "id": "0920-5632"
            }
        ], 
        "issue": {
            "title": "Theory of elementary particles", 
            "editor": [
                {
                    "name": "L\\\"ust, Dieter"
                }, 
                {
                    "name": "Otto, Hans-J\\\"urgen"
                }
            ], 
            "number": "3"
        }, 
        "name": "Nuclear Physics, B, Proceedings Supplements"
    }, 
    "ednote": "No ISBN available for this special issue.  It's 359pp.", 
    "date": {
        "year": "1997"
    }, 
    "identifier": [
        {
            "url": "http://dx.doi.org/10.1016/S0920-5632(97)00309-5", 
            "type": "DOI", 
            "id": "10.1016/S0920-5632(97)00309-5"
        }, 
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0957.83547", 
            "type": "ZMATH", 
            "id": "0957.83547"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=1462161", 
            "type": "MR", 
            "id": "MR1462161"
        }
    ], 
    "issue": {
        "venue": {
            "venue_days-months": "August 27--31", 
            "venue_string": "Buckow, Germany, August 27--31, 1996", 
            "venue_loc": "Buckow, Germany", 
            "venue_date": "August 27--31, 1996", 
            "venue_year": "1996", 
            "venue_country": "Germany", 
            "venue_city": "Buckow"
        }
    }, 
    "pages": "52--60", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "83E30", 
                "secondary": [
                    "81T30"
                ]
            }
        }, 
        {
            "codes": [
                "83E30", 
                "81T30"
            ], 
            "id": "msc2000class"
        }
    ]
}

{
    "title": "The primes contain arbitrarily long arithmetic progressions", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR2415379", 
    "abstract": {
        "text": "We prove that there are arbitrarily long arithmetic progressions of primes. There are three major ingredients. The first is Szemer\\'edi's theorem, which asserts that any subset of the integers of positive density contains progressions of arbitrary length. The second, which is the main new ingredient of this paper, is a certain transference principle. This allows us to deduce from Szemer\\'edi's theorem that any subset of a sufficiently pseudorandom set (or measure) of positive \\textit{relative} density contains progressions of arbitrary length. The third ingredient is a recent result of Goldston and Yildirim, which we reproduce here. Using this, one may place (a large fraction of) the primes inside a pseudorandom set of ``almost primes'' (or more precisely, a pseudorandom measure concentrated on almost primes) with positive relative density."
    }, 
    "author": [
        {
            "name": "Green, Ben"
        }, 
        {
            "name": "Tao, Terence"
        }
    ], 
    "collection": "NTP", 
    "id": "MR2415379", 
    "owner": "Matthew_R_Watkins", 
    "journal": {
        "volume": "167", 
        "shortcode": "Ann. Math.", 
        "identifier": [
            {
                "type": "issn", 
                "id": "0003-486X"
            }
        ], 
        "issue": {
            "number": "2"
        }, 
        "name": "Annals of Mathematics"
    }, 
    "link": [
        {
            "url": "http://annals.math.princeton.edu/wp-content/uploads/annals-v167-n2-p03.pdf", 
            "anchor": "LINK"
        }
    ], 
    "ednote": "No DOI seems to be available", 
    "date": {
        "year": "2008"
    }, 
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=1191.11025", 
            "type": "ZMATH", 
            "id": "1191.11025"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=2415379", 
            "type": "MR", 
            "id": "MR2415379"
        }, 
        {
            "url": "http://www.arxiv.org/abs/math/0404188", 
            "type": "arXiv", 
            "id": "math/0404188"
        }
    ], 
    "type": "article", 
    "pages": "481--547", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11N13", 
                "secondary": [
                    "11A41", 
                    "11B25", 
                    "37A45"
                ]
            }
        }, 
        {
            "codes": [
                "11N13", 
                "37A45", 
                "11B25", 
                "11A41"
            ], 
            "id": "msc2000class"
        }
    ]
}, 
{
    "other_relevant_persons": [
        {
            "name": "Coates, John H."
        }
    ], 
    "title": "Obstructions to uniformity, and arithmetic patterns in the primes", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR2251475", 
    "abstract": {
        "text": "In this expository article, we describe the recent approach, motivated by ergodic theory, towards detecting arithmetic patterns in the primes, and in particular establishing that the primes contain arbitrarily long arithmetic progressions. One of the driving philosophies is to identify precisely what the obstructions could be that prevent the primes (or any other set) from behaving ``randomly'', and then either show that the obstructions do not actually occur, or else convert the obstructions into usable structural information on the primes."
    }, 
    "author": [
        {
            "name": "Tao, Terence"
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR2251475", 
    "keywords": [
        "prime numbers", 
        "additive problems"
    ], 
    "journal": {
        "volume": "2", 
        "shortcode": "Pure Appl. Math. Q.", 
        "identifier": [
            {
                "type": "issn", 
                "id": "1558-8599"
            }
        ], 
        "issue": {
            "number": "2", 
            "title": "Special issue: in honor of {J}ohn {H}. {C}oates, part 2"
        }, 
        "name": "Pure and Applied Mathematics Quarterly"
    }, 
    "link": [
        {
            "url": "http://intlpress.com/JPAMQ/p/2006/395-433.pdf", 
            "anchor": "LINK"
        }
    ], 
    "ednote": "seemingly no DOI available", 
    "date": {
        "year": "2006"
    }, 
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=1105.11032", 
            "type": "ZMATH", 
            "id": "1105.11032"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=2251475", 
            "type": "MR", 
            "id": "MR2251475"
        }, 
        {
            "url": "http://www.arxiv.org/abs/math/0505402", 
            "type": "arXiv", 
            "id": "math/0505402"
        }
    ], 
    "type": "article", 
    "pages": "395--433", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "11N13", 
                "secondary": [
                    "11B25", 
                    "37A45"
                ]
            }
        }, 
        {
            "codes": [
                "11N13", 
                "11B25", 
                "374A5"
            ], 
            "id": "mscclass"
        }, 
        {
            "codes": [
                "11N13", 
                "11B25", 
                "37A45"
            ], 
            "id": "msc2000class"
        }
    ]
}

Handling of subject classification codes

My “subject” object is designed to store unlimited classification data on the item in question. I found many different schemes in use while enriching my records (MR, MSC, MSC1991, MSC2000, AMS, AMS1991, PACS), each with its own format. Subject is a list of dictionaries, each with an “id” key identifying the type of classification scheme, and either a “code_dict” key (a dictionary able to distinguish “primary” and “secondary” codes) or a “codes” key (just a list of strings when no such distinction is involved). See any of the examples above.

There’s also a “keywords” key (currently just a list of strings) which supplements this, and which could arguably be somehow integrated into the “subject” object.
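As a minimal sketch of how a consumer might read this structure, the following Python function collects the codes for each scheme, whichever of the two shapes (“code_dict” or “codes”) an entry uses. The function name all_codes is hypothetical and not part of my conversion scripts; the sample data is taken from the Green and Tao record above.

```python
def all_codes(subject):
    """Return {scheme_id: [codes, ...]} for a record's "subject" list.

    Handles both shapes described above: a "code_dict" with
    "primary"/"secondary" keys, or a plain "codes" list.
    """
    result = {}
    for entry in subject:
        scheme = entry["id"]
        if "code_dict" in entry:
            code_dict = entry["code_dict"]
            primary = code_dict.get("primary")
            # Primary code first, then any secondary codes in their given order.
            codes = ([primary] if primary else []) + list(code_dict.get("secondary", []))
        else:
            codes = list(entry.get("codes", []))
        result[scheme] = codes
    return result


subject = [
    {"id": "mrclass",
     "code_dict": {"primary": "11N13", "secondary": ["11A41", "11B25", "37A45"]}},
    {"id": "msc2000class", "codes": ["11N13", "37A45", "11B25", "11A41"]},
]
print(all_codes(subject))
```

Note that flattening a “code_dict” in this way throws away the primary/secondary distinction, so it only suits uses such as searching or indexing.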

Name formatting

I could further refine my conversion scripts so that they process author names in such a way that

{"name":"Watkins, Matthew R."}

would end up as

{"name":"Watkins, Matthew R.",
"name_string":"Matthew R. Watkins",
"name_first":"Matthew",
"name_last":"Watkins",
"name_middle":"R."}

with the added possibility of including lists of alternate spellings (for people whose names have diacritics, spaces or hyphens, or are regularly misspelled, etc.). I’m aware that ZMATH has “author name clusters” where variant names are in circulation; these might be useful if someone wanted to expand in that direction.
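A rough sketch of such a refinement, assuming every name is stored in “Last, First Middle” form, might look like the following. The function name expand_name is hypothetical; the output keys are the ones proposed above.

```python
def expand_name(person):
    """Return a copy of a {"name": "Last, First Middle"} dict with name_* keys added.

    Assumption: the part before the first comma is the surname and the first
    token after it is the first name. Names without a comma are treated as a
    surname only.
    """
    last, _, given = person["name"].partition(",")
    last = last.strip()
    given_parts = given.split()

    expanded = dict(person)  # leave the caller's dict untouched
    expanded["name_last"] = last
    if given_parts:
        expanded["name_first"] = given_parts[0]
        if len(given_parts) > 1:
            expanded["name_middle"] = " ".join(given_parts[1:])
        expanded["name_string"] = " ".join(given_parts + [last])
    else:
        expanded["name_string"] = last
    return expanded


print(expand_name({"name": "Watkins, Matthew R."}))
```

This is deliberately naive: a name such as “Osborne, M. Scott” would come out with “M.” as the first name and “Scott” as the middle name, so edge cases would still need checking by hand.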

Series volume

As mentioned above, what is often referred to as the “volume” of a series would be more accurately considered a “number” by bibliographers. A number in such a series could then consist of multiple volumes. In some unusual cases, e.g. Memoirs of the AMS, there’s a series volume AND a series number, while the book itself has multiple volumes. To avoid confusion, I introduced a “volume” key in the “series” object (which normally contains “name”, “number”, “issn” (where available), and could also include “series editor”). Here’s that example:

{
    "publisher": {
        "name": "American Mathematical Society", 
        "address": "Providence, RI"
    }, 
    "title": "Selberg trace formula", 
    "url": "http://bibsoup.net/Matthew_R_Watkins/NTP/MR0705744", 
    "series": {
        "volume": "44", 
        "issn": "0065-9266", 
        "name": "Memoirs of the American Mathematical Society", 
        "number": "283"
    }, 
    "author": [
        {
            "name": "Osborne, M. Scott"
        }, 
        {
            "name": "Warner, Garth"
        }
    ], 
    "owner": "Matthew_R_Watkins", 
    "collection": "NTP", 
    "id": "MR0705744", 
    "volume": "III: Inner product formulae (Initial considerations)", 
    "keywords": [
        "Eisenstein series", 
        "Selberg trace formula", 
        "residual forms", 
        "reductive Lie group", 
        "non-uniform lattice"
    ], 
    "ednote": "Front cover says ``Volume 44, Number 283'', so this is an ``edge case''", 
    "date": {
        "year": "1983"
    }, 
    "identifier": [
        {
            "url": "http://www.zentralblatt-math.org/zmath/en/search/?an=0515.22014", 
            "type": "ZMATH", 
            "id": "0515.22014"
        }, 
        {
            "url": "http://www.ams.org/mathscinet-getitem?mr=0705744", 
            "type": "MR", 
            "id": "MR0705744"
        }, 
        {
            "type": "isbn-13", 
            "id": "9780821822838"
        }, 
        {
            "type": "isbn", 
            "id": "0821822837"
        }
    ], 
    "type": "book", 
    "pages": "iv+209pp.", 
    "subject": [
        {
            "id": "mrclass", 
            "code_dict": {
                "primary": "22E40", 
                "secondary": [
                    "11F70", 
                    "11F72"
                ]
            }
        }, 
        {
            "codes": [
                "22E50", 
                "22E30", 
                "22E40", 
                "22E46", 
                "43A80"
            ], 
            "id": "msc2000class"
        }
    ]
}

Articles with multiple report numbers could perhaps be handled as a list of report objects.
Where there are multiple report numbers separated by semicolons, I try to find the matching report institutions and include them in REPORT_INST, also separated by semicolons. A script could then, presumably, match them up into mini-dictionaries bundled into a list.
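The matching step could be sketched as follows. This is only an illustration under stated assumptions: the function name pair_reports and the “institution” key are hypothetical, and the institution names in the example are placeholders rather than real data. Institutions are attached only when their count matches the count of report numbers, so a mismatch never produces a wrong pairing.

```python
def pair_reports(numbers, institutions=""):
    """Split semicolon-separated report numbers (and, optionally, institutions)
    into a list of small dictionaries, one per report."""
    nums = [n.strip() for n in numbers.split(";") if n.strip()]
    insts = [i.strip() for i in institutions.split(";")] if institutions.strip() else []

    reports = []
    for index, number in enumerate(nums):
        report = {"number": number}
        # Only pair up when there is exactly one institution per report number.
        if len(insts) == len(nums) and insts[index]:
            report["institution"] = insts[index]  # hypothetical key name
        reports.append(report)
    return reports


# Report numbers from the Aref'eva et al. record above; institutions are placeholders.
print(pair_reports("SMI-26-96; AIV-3-96"))
print(pair_reports("SMI-26-96; AIV-3-96", "Institute A; Institute B"))
```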

Other journal identifiers beside “issn”

Typically, we have, for a “journal” object:

"journal": {
    "volume": "43", 
    "shortcode": "Duke Math. J.", 
    "identifier": [
        {
            "type": "issn", 
            "id": "0012-7094"
        }
    ], 
    "issue": {
        "number": "3"
    }, 
    "name": "Duke Mathematical Journal"
}

As I’ve only used ISSN for identifying journals in the collection, for a while I was considering changing this to the simpler

"journal": {
    "volume": "43", 
    "shortcode": "Duke Math. J.", 
    "issn": "0012-7094",
    "issue": {
        "number": "3"
    }, 
    "name": "Duke Mathematical Journal"
}

But as it’s structured, the “journal” object allows multiple types of identifiers for journals. ISSN is the most common, but there’s also CODEN, etc., and others may come along which would be helpful to add.
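One consequence of keeping the “identifier” list is that reading an ISSN takes a small lookup rather than a single key access. A minimal sketch, with a hypothetical helper name journal_identifier, might be:

```python
def journal_identifier(journal, id_type):
    """Return the first identifier of the given type from a "journal" object,
    or None if the journal has no identifier of that type."""
    for identifier in journal.get("identifier", []):
        # Compare case-insensitively, since records use both "issn" and "DOI" styles.
        if identifier.get("type", "").lower() == id_type.lower():
            return identifier.get("id")
    return None


journal = {
    "volume": "43",
    "shortcode": "Duke Math. J.",
    "identifier": [{"type": "issn", "id": "0012-7094"}],
    "issue": {"number": "3"},
}
print(journal_identifier(journal, "issn"))
print(journal_identifier(journal, "coden"))
```

The same helper would work unchanged if CODEN or other identifier types were added to the list later.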

Flatness

I seem to recall that the documentation at bibjson.org suggested that BibJSON should be kept as “flat” as possible, avoiding too much nesting of data (I can’t currently find the relevant passage). I realise now that I may be guilty of overzealous nesting practices, but would happily adjust my scripts to flatten things out according to some definitive guidance, once standards start to crystallise. The nesting aspect of BibJSON is its most attractive feature, I find, but I can also see how one might get carried away with it!
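To give a sense of what mechanical flattening might involve, here is one possible sketch. The dotted-key convention (“issue.number”, “identifier.0.id”) is purely my own assumption for illustration and is not taken from any BibJSON guidance; the function name flatten is likewise hypothetical.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single dict with dotted keys.

    Dict keys and list indices are joined with "."; empty dicts and lists
    simply disappear from the output.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}.{index}" if prefix else str(index)))
    else:
        flat[prefix] = value
    return flat


journal = {
    "volume": "43",
    "identifier": [{"type": "issn", "id": "0012-7094"}],
    "issue": {"number": "3"},
}
print(flatten(journal))
```

Whether keys like these would be any easier to work with than the nested form is exactly the kind of question that definitive guidance would need to settle.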
