User:Forbes72/doi

Why good citations are important

This page mainly concerns Wikipedia citations as used in physics. For general background about academic citations, see Wikipedia:Scientific citation guidelines, which provides a good goal to work toward. In practice, many citations fall short of that standard. Some examples: while it is common practice in print journals to use ISO-4 abbreviations in citations, this can be confusing to a general audience, and is discouraged on Wikipedia. If a publisher hosts a journal article, it is generally better to link to it through a digital object identifier rather than a simple URL, because the identifier is more resistant to link rot. Sometimes articles are cited without a title, requiring them to be located by volume and page. This can lead to a lot of confusion (e.g. a citation may use the translated name of a non-English journal, but there may be an English-language journal with the same name). The goal of the citation is to make verifiability as easy as possible.

Research supported by the Wikimedia Foundation (see Meta:Research:Characterizing Wikipedia Citation Usage) has found that click-through rates on citations are relatively low (less than 1%). However, the quality of the citations significantly improves the usefulness of the encyclopedia for those who do click through. In particular, clearly marking open-access sources seems likely to increase the number of people who click through to the sources themselves.

Pros and cons of doi

Where it exists, the most generally useful identifier to include in a citation is the digital object identifier. It has several advantages: it is machine-readable, unambiguous, and unique, and it provides some access to the source itself. There is a lot of useful general information about current use of doi citations at Wikipedia:WikiProject Academic Journals/Journals cited by Wikipedia. There are two main issues with the doi, though.

Missing doi

First, not every article has a doi. While the vast majority of English-language peer-reviewed journals published today assign a doi to every article, coverage becomes much spottier for both non-English and older articles. A typical example is older volumes of the Journal of Experimental and Theoretical Physics. From 1997 onward the journal has been published by Springer, so volumes 84 and up all have dois. The older articles are in fact freely available [1] (in both Russian and English translation), but apparently lack a doi. Many journals that lack dois are simply very small journals, e.g. Revista Mexicana de Física [2] or the Romanian Journal of Physics [3].

Paywalled doi

Second, the doi may link to a paywalled resource when free alternatives are available. For newer articles, arXiv often provides free access, while for articles with expired copyright, repositories such as the Internet Archive or HathiTrust are often useful. So for paywalled articles, it is good to check whether the article is freely available somewhere. However, this is not an immediate concern, as simply adding the doi can make it easier for a bot to come through and add a link to a free resource later.

As an example, consider the automatically generated reference:

Langmuir, Irving (1913-12-01). "The Effect of Space Charge and Residual Gases on Thermionic Currents in High Vacuum". Physical Review. 2 (6). American Physical Society (APS): 450–486. doi:10.1103/physrev.2.450. ISSN 0031-899X.

It correctly identifies the source article with a doi, but access to the abstract and text of the article depends on whether the reader has permission from the publisher to access the journal. In this case, however, the article is older than the 95-year term of copyright from the date of initial publication [4], which means it is now in the public domain. We can therefore add a URL from HathiTrust, which has archival scans of many academic journals and periodically removes the access restrictions on them as copyright expires. In this case, the reference becomes the following (with the title now linking to the scan):

Langmuir, Irving (1913-12-01). "The Effect of Space Charge and Residual Gases on Thermionic Currents in High Vacuum". Physical Review. 2 (6). American Physical Society (APS): 450–486. doi:10.1103/physrev.2.450. ISSN 0031-899X.
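
The age check described above can be sketched in a few lines. This is only a rule of thumb under stated assumptions: the function name is hypothetical, the 95-year figure is the one quoted above, and the term is assumed to run through the end of the 95th calendar year after publication, so a work only counts as expired from the following year.

```python
def copyright_term_expired(publication_year: int, current_year: int,
                           term_years: int = 95) -> bool:
    """Return True if a work published in publication_year is past its term.

    Assumption: the term runs through the end of the calendar year
    publication_year + term_years, so the work is free from the next year on.
    """
    return current_year - publication_year > term_years


# The Langmuir article from the example above was published in 1913.
print(copyright_term_expired(1913, 2024))
```

A check like this only suggests that a free scan is worth looking for; whether HathiTrust or the Internet Archive actually hosts an open copy still has to be confirmed by hand.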

Citer and semi-automated references

Filling out a cite journal template can be done by hand, but it is time-consuming and prone to error. If a doi exists, the citer tool usually makes citing sources significantly easier. It is open source and hosted on GitHub. On the back end, it pulls metadata from Crossref and reformats it for Wikipedia. Simply entering a doi usually gives a usable full citation that can be copied and pasted into the body of a Wikipedia article. However, the system has some quirks, so it is advisable to check the output rather than copying it blindly. Journal citations are generally more consistently correct than conference or book citations, which very often have incorrect or missing information.
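
To make the reformatting step concrete, here is a minimal sketch of turning a Crossref-style metadata record into a cite journal template. It is not citer's actual code. The function name is hypothetical, the record is typed in by hand from the Langmuir citation above rather than fetched, and the field names ("author", "issued", "container-title", "page", and so on) are assumed to follow the layout of Crossref's JSON records.

```python
def cite_journal(meta: dict) -> str:
    """Render a Crossref-style record as a {{cite journal}} template string."""

    def first(key: str) -> str:
        # Assumption: titles and ISSNs are stored as lists in the record.
        values = meta.get(key) or [""]
        return values[0]

    params = []
    for i, author in enumerate(meta.get("author", []), start=1):
        params.append((f"last{i}", author.get("family", "")))
        params.append((f"first{i}", author.get("given", "")))

    # Join whatever date parts are present (year, month, day) with hyphens.
    date_parts = (meta.get("issued", {}).get("date-parts") or [[]])[0]
    params.append(("date", "-".join(f"{part:02d}" for part in date_parts)))

    params += [
        ("title", first("title")),
        ("journal", first("container-title")),
        ("volume", meta.get("volume", "")),
        ("issue", meta.get("issue", "")),
        # Page ranges conventionally use an en dash rather than a hyphen.
        ("pages", meta.get("page", "").replace("-", "–")),
        ("doi", meta.get("DOI", "")),
        ("issn", first("ISSN")),
    ]
    # Skip empty values so missing metadata does not produce blank parameters.
    body = " ".join(f"|{name}={value}" for name, value in params if value)
    return "{{cite journal " + body + "}}"


# Hand-written sample record, assembled from the Langmuir citation above.
sample = {
    "author": [{"given": "Irving", "family": "Langmuir"}],
    "issued": {"date-parts": [[1913, 12, 1]]},
    "title": ["The Effect of Space Charge and Residual Gases on "
              "Thermionic Currents in High Vacuum"],
    "container-title": ["Physical Review"],
    "volume": "2",
    "issue": "6",
    "page": "450-486",
    "DOI": "10.1103/physrev.2.450",
    "ISSN": ["0031-899X"],
}
print(cite_journal(sample))
```

Because empty values are dropped silently, a record with missing authors or pages still produces a template, which is exactly why the output needs to be checked by eye.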

Page/Article field mismatch

  • Articles published by the American Physical Society and in Scientific Reports are often missing page numbers. This happens when the metadata contains an "article-number" field rather than the usual "page" field that citer uses in citations.
10.1103/PhysRevA.66.032111
10.1103/PhysRevB.63.104420
10.1103/PhysRevC.82.015208
10.1103/PhysRevD.58.096004
10.1103/PhysRevX.4.021013
10.1103/PhysRevLett.97.021801
10.1038/s41598-019-55300-w
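
One way to handle this mismatch is to fall back to the article number when no page is present. The sketch below is an illustration, not citer's behaviour. The function name is hypothetical, the sample record is hand-written, and the output parameter name `article-number` is an assumption about what the citation template accepts; pass `article_param="page"` if a plain page field is preferred.

```python
def location_parameter(meta: dict, article_param: str = "article-number") -> str:
    """Return a page parameter, falling back to the article number."""
    page = meta.get("page", "")
    if page:
        # A hyphen marks a page range, which takes the plural parameter.
        name = "pages" if "-" in page else "page"
        return f"|{name}={page.replace('-', '–')}"
    number = meta.get("article-number", "")
    if number:
        return f"|{article_param}={number}"
    return ""


# Illustrative record: only an article number, no page field.
print(location_parameter({"article-number": "032111"}))
print(location_parameter({"page": "450-486"}))
```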

Known issues with the Crossref metadata

  • Articles published by the Royal Society of London and the American Association for the Advancement of Science are sometimes missing authors. The authors are absent from the underlying metadata:
10.1098/rsta.1970.0068
10.1126/science.287.5455.1024
  • Citations to World Scientific publications are sometimes rendered in all capital letters:
10.1142/S0217979291001085
10.1142/9789814439688_0003
  • Non-ASCII Latin characters such as ä, ö, ü, and é often end up garbled and have to be retyped manually. This mojibake appears in the underlying Crossref metadata:
10.1007/BF01374560
10.1051/jp2:1993211
  • Mathematical and chemical formulas generally lose all their formatting, which has to be re-entered:
10.1103/PhysRevLett.75.1028
10.1107/S0567740870004375
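
Some of these problems can be flagged automatically before a citation is pasted in. The following is a rough heuristic sketch. The function name and sample records are hypothetical, the all-capitals check is applied to the title only, and the mojibake markers assume the common case of UTF-8 text misread as Latin-1, which may not be the kind of garbling present in a given record. It will also raise false alarms on legitimate titles containing characters such as "Ã".

```python
def metadata_warnings(meta: dict) -> list[str]:
    """List likely problems in a Crossref-style record (heuristic only)."""
    warnings = []
    title = (meta.get("title") or [""])[0]

    if not meta.get("author"):
        warnings.append("no authors in metadata")

    letters = [ch for ch in title if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        warnings.append("title is all capitals")

    # U+FFFD is the Unicode replacement character; "Ã" and "Â" are typical
    # leftovers when UTF-8 bytes are decoded as Latin-1 (an assumption here).
    if any(marker in title for marker in ("\ufffd", "Ã", "Â")):
        warnings.append("title may contain mojibake")

    return warnings


# Hypothetical records, written by hand to trigger each check.
print(metadata_warnings({"title": ["QUANTUM HALL EFFECT"], "author": []}))
print(metadata_warnings({"title": ["Ãœber etwas"], "author": [{"family": "X"}]}))
```

Lost formula formatting is not covered, since there is no reliable textual marker for it; those titles still need to be compared against the published article.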

Citing a single abstract in a collection

Occasionally, scientific journals will publish a bundle of many abstracts (often conference proceedings) in a collection. This can mean that a doi exists for the whole collection, but not for the particular abstract within the collection. For example:

A. R. Lang: Acta Crystallogr. (1957b) 10, 839.

This falls within the following collection:

"Abstracts of Papers". Acta Crystallographica. 10 (12). International Union of Crystallography (IUCr): 735–863. 1957-12-01. doi:10.1107/s0365110x57002649. ISSN 0365-110X.

A good way to get specificity and explain what the citation contains is to re-title the citation in the format "[collection name]: [article name]" and then add the author back in.

Lang, A. R. (1957-12-01). "Abstracts of Papers: Point-by-point X-ray diffraction studies of imperfections in melt-grown crystals". Acta Crystallographica. 10 (12). International Union of Crystallography (IUCr): 839. doi:10.1107/s0365110x57002649. ISSN 0365-110X.
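
The re-titling step can be expressed as a small transformation on the collection's citation fields. This is a sketch under assumptions: the function names are hypothetical, and the parameter dictionary is typed in by hand from the collection citation above.

```python
def cite_abstract(collection: dict, last: str, first: str,
                  abstract_title: str, page: str) -> dict:
    """Derive a citation for one abstract from its collection's citation."""
    # Put the author first, then copy every field of the collection.
    entry = {"last1": last, "first1": first, **collection}
    # Re-title in the format "[collection name]: [article name]".
    entry["title"] = f"{collection['title']}: {abstract_title}"
    # Replace the collection's page range with the abstract's own page.
    entry.pop("pages", None)
    entry["page"] = page
    return entry


def render(entry: dict) -> str:
    """Join the fields into a {{cite journal}} template string."""
    return "{{cite journal " + " ".join(f"|{k}={v}" for k, v in entry.items()) + "}}"


collection = {
    "title": "Abstracts of Papers",
    "journal": "Acta Crystallographica",
    "volume": "10",
    "issue": "12",
    "pages": "735–863",
    "date": "1957-12-01",
    "doi": "10.1107/s0365110x57002649",
    "issn": "0365-110X",
}
entry = cite_abstract(
    collection, "Lang", "A. R.",
    "Point-by-point X-ray diffraction studies of imperfections in melt-grown crystals",
    "839",
)
print(render(entry))
```

The doi is deliberately left pointing at the whole collection, since no more specific identifier exists for the abstract.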

Volume mismatch

There is a particular issue with Annalen der Physik, which has two possible volume numbering systems. In the old system, volume numbering restarted with each editor, whereas the current standard is never to reuse volume numbers. [5] The new system is better, but it can unfortunately cause confusion when looking up papers cited under the old one. Best practice is to search by year first, and then by page. For example:

D. Hondros, “Ueber elektromagnetische Drahtwelle,” Annalen der Physik, Vol. 30, pp. 905–949, 1909.

Becomes:

Hondros, D. (1909). "Über elektromagnetische Drahtwellen". Annalen der Physik (in German). 335 (15). Wiley: 905–950. doi:10.1002/andp.19093351504. ISSN 0003-3804.
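
As a cross-check after searching by year and page, the two volume numbers in this example can be related by a fixed offset. The sketch below infers that offset purely from the Hondros example (old volume 30, current volume 335) and assumes it stays constant for other volumes in the same old-style series. It says nothing about other series, so the year-and-page search remains the primary method. The names are hypothetical.

```python
# Offset inferred from the example above: old volume 30 -> current volume 335.
SERIES_OFFSET = 335 - 30


def current_volume(old_volume: int, offset: int = SERIES_OFFSET) -> int:
    """Convert an old-series volume number to the continuous numbering.

    Assumption: the offset is constant within the series that contains
    the 1909 example; other series need their own offsets.
    """
    return old_volume + offset


print(current_volume(30))
```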

Useful fields which are not auto-generated

Despite these issues, citer is usually the fastest way to get a full citation. Other additions that help improve citations:

  • add |doi-access=free for open-access journals
  • add |display-authors=5 for articles with a long author list
  • add in collaboration names with the |collaboration= parameter
  • add other identifiers such as |arxiv=, |pmid=, |mr=, etc.
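
Adding these fields to an auto-generated citation is mechanical enough to sketch as a helper. The function name is hypothetical, and the example citation is abbreviated to a single doi parameter for illustration. Underscores in the keyword names are converted to hyphens because Python identifiers cannot contain hyphens.

```python
def add_parameters(template: str, **params: str) -> str:
    """Append |name=value pairs before the closing braces of a template."""
    stripped = template.rstrip()
    if not stripped.endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    body = stripped[:-2].rstrip()
    extra = " ".join(
        "|" + name.replace("_", "-") + "=" + value
        for name, value in params.items()
    )
    return body + " " + extra + "}}"


# Abbreviated example citation; a real one would carry the full metadata.
citation = "{{cite journal |doi=10.1038/s41598-019-55300-w}}"
print(add_parameters(citation, doi_access="free", display_authors="5"))
```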

Working without dois

Providing good access to sources without a doi is also possible, but can be more difficult. When the article or abstract is available from the publisher in English, it is best to link directly to that. In other cases, it is necessary to look further. As an example, consider the following citation (found here):

Belov, N. V., Neronova, N. N., and Smirnova, T. S. (1955). “1651 Shubnikov groups,” (In Russian). Trudy Inst. Kristallogr. Acad. SSSR 11, 33–67.

The first step is to find the journal name. In general, a foreign-language journal can go by a variety of names. Possibilities include the journal title in Cyrillic characters, the title transliterated into Latin characters, the title transliterated into Latin characters and abbreviated, and the title translated into English. It is also important to find out whether the reference refers to the original paper or to a translation of the journal, because (as with the Journal of Experimental and Theoretical Physics) translations are sometimes published with page numbers different from those of the articles they are translated from.

In this case, the journal name is an abbreviation of the transliterated Russian title. Unfortunately, the journal is not indexed in English anywhere online. However, an English-language abstract is in fact available online at the Central Intelligence Agency's reading room, based on a Freedom of Information Act request concerning one of the authors [6] (pages 43–44).

In general, there are no comprehensive databases for such sources, and it comes down to web searches, physical libraries, or knowledge about where to find related material. As stated before, the Internet Archive and HathiTrust are good starting points for old titles that may be hard to find elsewhere.