Template:Did you know nominations/Trove (website)

The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.

The result was: promoted by 97198 (talk) 10:14, 18 December 2014 (UTC)

Trove

  • ... that thousands of volunteers have corrected millions of lines of digitised Australian newspapers on Trove?

Created by SatuSuro (talk), Wittylama (talk), Aliaretiree (talk), Whiteghost.ink (talk). Nominated by Wittylama (talk) at 20:05, 7 December 2014 (UTC).

  • Note: article has been moved from Trove (website) to Trove – I have adjusted the hook and credit templates as such. 97198 (talk) 03:01, 8 December 2014 (UTC)
  • New enough, long enough, and free of most policy issues (the article is neutral and mostly well cited). One issue, though, is that there are a few cases of uncited content in the article. These include, so far as I can discern, most of the "zones" listed in the "Content" section, the last two sentences of the "Books" subsection, a few sentences in the "Australian Newspapers Digitisation Project" subsection, the sentence about metadata in the "Implementation" section, and a couple sentences in the "API" subsection. There are also a couple of cases of overly close paraphrasing, in my opinion: [1], [2] (the second "hit" only, the first is a properly attributed quote), and [3]. Furthermore, some of the references are offline sources to which I do not have access, so I am accepting these citations in good faith (although I was at least able to verify the existence of three books and a journal via ISBNs/ISSNs). The hook is short enough, neutral, interesting, and cited in the article (with an offline reference that I am accepting in good faith). Also, QPQ has been done. All things considered, this article looks pretty good. I'll be more than happy to give it my stamp of approval once the aforementioned uncited content issues are addressed and the couple of instances of close paraphrasing are rectified. Michael Barera (talk) 23:23, 13 December 2014 (UTC)
  • Michael Barera - That's certainly a thorough review of the article... I think you've set quite a high bar there for the critique of close paraphrasing, since most of the things these searches flagged are names (e.g. "hosted by the national library of australia", "state library of new south wales" and "register of Australian archives and manuscripts"), so there is no other way to write them. I've made a few changes to the sentences which genuinely are paraphrased - I hope this is sufficient (diff). As for the fact in the hook, there is also an online source for that fact (number 26 - Drake), but that is a blogpost by a Trove employee so it's not "independent". Nevertheless, it's a good backup with links to the live 'hall of fame' of text correctors. I hope that's sufficient for your needs, Wittylama 16:12, 15 December 2014 (UTC)
  • Thanks so much for the reply, Wittylama. The close paraphrasing looks to be completely resolved, at least to my eye, and to be clear I am accepting the hook reference in good faith, not questioning the citation in any way. The only outstanding issue appears to be the remaining uncited content. Forgive me if I seem paranoid about it, but I've recently made the error of not being strict enough about the referencing of a list section in another DYK nomination. So, for your convenience, I will list my original instances of uncited content below along with their current citation status (checkY for properly cited, ☒N for still lacking a citation):
    • checkY Most of the "zones" listed in the "Content" section (everything but "Music, sound and videos" and "Pictures")
For this point, I dispute that these zones need individual footnotes - this is a DYK review, not a GA review, after all. The existence of these "zones" is verifiable inasmuch as they appear in large boxes on the front page of the website itself - see the screenshot in the infobox, for example. As an overarching reference, I'll add a footnote to the "finding things" Trove Help page, which describes the different areas of content. Wittylama 16:38, 16 December 2014 (UTC)
I'd highly recommend that you ask for Yoninah's opinion on citing this section. I'm erring on the side of caution here, and I'll stand down if both of you agree that I'm wrong. Michael Barera (talk) 19:04, 16 December 2014 (UTC)
 Question: The overarching reference is an improvement, but is there any source that can be found that substantiates the whole list of "zones" that is given in the article? That would be the best solution, in my opinion. Once this issue is resolved, I'll be happy to give my support to the nomination. (As you can see from my comments below, all of the other issues I raised previously look to be resolved now.) Michael Barera (talk) 19:34, 17 December 2014 (UTC)
  • @Michael Barera: I saw my name on this template discussion. The page creator has adequately sourced the entire list with footnote 10. For some reason, the URL is directing you to Finding Things, but if you click on the "more than 370 million resources" link in the first line, it will take you to the proper webpage.
  • For a list such as this, which appears in one source, there is no need for individual line cites. We are also not concerned with the additional information on several lines that does not appear in the source, since DYK only requires one cite per paragraph, and this list is considered one paragraph. The example I sent you under Chaverim#Activities is different – it's a cobbled-together list that does not appear in one source, so each line needs substantiation. Best, Yoninah (talk) 19:55, 17 December 2014 (UTC)
Sorry, Yoninah, I missed the deceptive redirecting issue. Upon clicking through to the Current work counts by zone page, the proper support for the whole list can indeed be found. Sorry for the bother. This resolves my last issue with the nomination, and I am now giving my support. Good to go! Michael Barera (talk) 20:05, 17 December 2014 (UTC)
    • checkY The last two sentences of the "Books" subsection ("The results can be filtered by format if searching for braille, audio books, theses or conference proceedings and also by decade and language of publication. A filter for Australian content is also provided.")
With the help of Aliaretiree, this section now has three footnotes - to a conference paper, to an industry journal and to Trove's own help documentation. Wittylama 23:21, 16 December 2014 (UTC)
 Done The "Books" subsection looks good to me now. Michael Barera (talk) 19:34, 17 December 2014 (UTC)
    • checkY A few sentences in the "Australian Newspapers Digitisation Project" subsection ("The website was the public face of the Australian Newspapers Digitisation Project (ANDP), a coordination of major libraries in Australia to convert historic newspapers to text-searchable digital files. The Australian Newspapers website allowed users to search the database of digitised newspapers from 1803 to 1954 which are now in the public domain. The newspapers (frequently microfiche or other photographic facsimiles) were scanned and the text from the articles has been captured by optical character recognition (OCR) to facilitate easy searching, but it still contains many OCR errors, often due to poor quality facsimiles. The system therefore incorporated crowdsourced text-correction as a major feature, allowing the public to change the machine-readable text.")
I've added two footnotes to this part - one to a 2008 article in a mainstream national newspaper and the other to a blog relevant to one of Trove's target audiences, genealogists. Wittylama 23:51, 16 December 2014 (UTC)
 Done The "Australian Newspapers Digitisation Project" subsection looks good to me now. Michael Barera (talk) 19:34, 17 December 2014 (UTC)
    • checkY The sentence about metadata in the "Implementation" section (which has since been expanded to two uncited sentences: "With the notable exception of the newspaper "zone", none of the material that appears in Trove search results is hosted by Trove itself. Instead, it indexes the content of its content partners' collection metadata and displays the aggregated information in a relevance-ranked search result.")
 Done The content about metadata in the "Implementation" section looks good to me now. Michael Barera (talk) 19:34, 17 December 2014 (UTC)
    • checkY A couple sentences in the "API" subsection ("Trove provides an Application Programming Interface (API) which allows developers to search across the records for books, images, maps, video, archives, music, sound, journal articles, newspaper articles and lists and to retrieve the associated metadata using XML and JSON encoding." and "The commitment to open data provides many opportunities for research and digital scholarship.")
I've added three footnotes for the API - one to an industry report which is proof that the API exists, a second from the Trove help documentation about the API specifics and a third which is a technical blogpost. I've removed the sentences about "digital scholarship" until some future date when it can return with several examples (I hope). Wittylama 23:34, 16 December 2014 (UTC)
 Done The "API" subsection looks good to me now. Michael Barera (talk) 19:34, 17 December 2014 (UTC)
I hope that this helps. I really did enjoy reading the article, and once these outstanding uncited statements are properly cited, I'll be more than happy to give my full support to this nomination. Take care! Michael Barera (talk) 01:36, 16 December 2014 (UTC)
In case it wasn't clear earlier in this thread, my final issue with the nomination has now been resolved, and I am giving my full support. It looks good to go! Michael Barera (talk) 20:05, 17 December 2014 (UTC)