User:HersfoldCiteBot/Trial run logs

This page will contain the bot's operation logs during its BAG-approved trial runs. These logs are normally saved locally on my computer and are not displayed on the wiki unless needed.

These trials are conducted within the Eclipse IDE, in "Debug" mode. If I notice a significant problem during a run, I will pause the program to investigate the cause and, if necessary, terminate the run manually.

Trial 1 - Failed

This run was terminated early when the bot failed to notice any cite web errors in a number of articles that do contain them. On suspending the program, I noticed a problem with the "displaytext" the bot receives and searches for error messages. This display text is the HTML a web browser receives when viewing the page. Unfortunately, the bot does not appear to be getting the entire page this way; the text is cut off well before the references section, so the bot finds no errors and moves on to the next article. This needs to be fixed before the bot can do any good.

Never mind: on further investigation, this is an error in my bot's code. When testing on test.wikipedia.org, I had copied the error messages in as plain text. Here the error messages are generated live, so the bot sees them as HTML markup and the plain-text search strings never match. I'm going to update the search strings, which should fix the problem.

------------------------------------------------------
HersfoldCiteBot Operation Log
Running version 1.1.2b
September 23 2010, 23:01:15 UTC
------------------------------------------------------

This is a trial run; the bot will make 30 edits, then stop.

23:01:15 - Attempting login...

23:01:17 - Successfully logged in as HersfoldCiteBot on en.wp.

23:01:19 - Getting articles in Category:Articles with broken citations
23:01:20 - Processing 'O Sole Mio
23:03:42 - No {{cite web}} errors found in this article.

23:03:42 - Processing 2010 Women's Rugby World Cup squads
23:03:49 - No {{cite web}} errors found in this article.

23:03:49 - Processing 5 Centimeters Per Second
23:03:54 - No {{cite web}} errors found in this article.

23:03:54 - Processing AIESEC
23:03:57 - No {{cite web}} errors found in this article.

23:03:57 - Processing ASCII art
23:04:01 - No {{cite web}} errors found in this article.

23:04:01 - Processing Aaliyah (album)
23:04:05 - No {{cite web}} errors found in this article.

23:04:05 - Processing Alan Ritchson
23:04:06 - No {{cite web}} errors found in this article.

23:04:06 - Processing Alex Timbers
23:04:07 - No {{cite web}} errors found in this article.

23:04:07 - Processing All Our Kings Are Dead
23:04:08 - No {{cite web}} errors found in this article.

23:04:08 - Processing Amelia Reynolds (television presenter)
23:04:09 - No {{cite web}} errors found in this article.

23:04:09 - Processing American Idiot (song)
23:04:11 - No {{cite web}} errors found in this article.

23:04:11 - Processing American Slang
23:04:13 - No {{cite web}} errors found in this article.

23:04:13 - Processing Amy Irving
23:04:14 - No {{cite web}} errors found in this article.

23:04:14 - Processing Auburn University
23:04:17 - No {{cite web}} errors found in this article.

23:04:17 - Processing Back It Up (song)
23:04:19 - No {{cite web}} errors found in this article.

23:04:19 - Processing Bagri (clan)
23:04:21 - No {{cite web}} errors found in this article.

23:04:21 - Processing Bandera, Texas
23:04:24 - No {{cite web}} errors found in this article.

23:04:24 - Processing Bay Village, Ohio
23:04:25 - Logging out and shutting down.

Trial 2 - Failed

While the problems from the previous trial were fixed, there appear to be some more errors in actually correcting the templates. I have reverted all the bot's edits, but I'll need to look through the log and code to determine the exact problems and causes.



------------------------------------------------------
HersfoldCiteBot Operation Log
Running version 1.1.3b
September 24 2010, 00:06:07 UTC
------------------------------------------------------

This is a trial run; the bot will make 10 edits, then stop.

00:06:07 - Attempting login...

00:06:08 - Successfully logged in as HersfoldCiteBot on en.wp.

00:06:10 - Getting articles in Category:Articles with broken citations
00:06:10 - Processing 'O Sole Mio
00:06:26 - Possible fixable errors found, attempting corrections
00:06:26 - Getting text of 'O Sole Mio
00:06:26 - Trying to add a title= parameter to {{citeweb|http://www.thisissuttoncoldfield.co.uk/news/Tributes-Renato-8211-singing-sensation/article-1237758-detail/article.html|author=J Newton|publisher=This Is Sutton Coldfield|date=2009-08-10|accessdate=2010-09-12}}
00:06:27 - Saving changes to 'O Sole Mio

00:06:37 - Processing 2010 Women's Rugby World Cup squads
00:06:43 - No {{cite web}} errors found in this article.

00:06:43 - Processing 5 Centimeters Per Second
00:06:49 - No {{cite web}} errors found in this article.

00:06:49 - Processing AIESEC
00:06:51 - No {{cite web}} errors found in this article.

00:06:51 - Processing ASCII art
00:06:57 - Possible fixable errors found, attempting corrections
00:06:57 - Getting text of ASCII art
00:06:58 - Trying to add a title= parameter to {{cite web
   |first=Simon
   |last=Jansen
   |date=April 18, 2006
   |url=http://www.asciimation.co.nz/
   |title=Star "ASCIImation" Wars
   |publisher=Asciimation.co.nz
   |accessdate=2008-11-18
}}
00:07:05 - IOException recieved when trying to access http://Simon .
00:07:05 - Saving changes to ASCII art

00:07:22 - Processing Aaliyah (album)
00:07:26 - No {{cite web}} errors found in this article.

00:07:26 - Processing Alan Ritchson
00:07:27 - No {{cite web}} errors found in this article.

00:07:27 - Processing Alex Timbers
00:07:28 - No {{cite web}} errors found in this article.

00:07:28 - Processing All Our Kings Are Dead
00:07:30 - No {{cite web}} errors found in this article.

00:07:30 - Processing Amelia Reynolds (television presenter)
00:07:31 - No {{cite web}} errors found in this article.

00:07:31 - Processing American Idiot (song)
00:07:33 - No {{cite web}} errors found in this article.

00:07:33 - Processing American Slang
00:07:35 - No {{cite web}} errors found in this article.

00:07:35 - Processing Amy Irving
00:07:40 - No {{cite web}} errors found in this article.

00:07:40 - Processing Auburn University
00:07:44 - No {{cite web}} errors found in this article.

00:07:44 - Processing Back It Up (song)
00:07:46 - No {{cite web}} errors found in this article.

00:07:46 - Processing Bagri (clan)
00:07:49 - No {{cite web}} errors found in this article.

00:07:49 - Processing Bandera, Texas
00:07:51 - No {{cite web}} errors found in this article.

00:07:51 - Processing Bay Village, Ohio
00:07:54 - No {{cite web}} errors found in this article.

00:07:54 - Processing Belgian nationality law
00:07:55 - No {{cite web}} errors found in this article.

00:07:55 - Processing Ben Affleck
00:08:02 - No {{cite web}} errors found in this article.

00:08:02 - Processing Big Four (audit firms)
00:08:04 - No {{cite web}} errors found in this article.

00:08:04 - Processing Blake LeVine
00:08:05 - No {{cite web}} errors found in this article.

00:08:05 - Processing Blindcrake
00:08:14 - No {{cite web}} errors found in this article.

00:08:14 - Processing Boston University
00:08:22 - No {{cite web}} errors found in this article.

00:08:22 - Processing Britney's New Look
00:08:24 - No {{cite web}} errors found in this article.

00:08:24 - Processing C++
00:08:29 - No {{cite web}} errors found in this article.

00:08:29 - Processing Cafe Antarsia Ensemble
00:08:32 - No {{cite web}} errors found in this article.

00:08:32 - Processing California Polytechnic State University
00:08:38 - No {{cite web}} errors found in this article.

00:08:38 - Processing Canada Bank Act
00:08:40 - Possible fixable errors found, attempting corrections
00:08:42 - Getting text of Canada Bank Act
00:08:43 - Trying to add a title= parameter to {{cite web|url=http://faculty.marianopolis.edu/c.belanger/quebechistory/encyclopedia/BankinginCanada-CanadianBanks-CanadianHistory.htm=[[The Quebec History Encyclopedia
]]}}
00:08:45 - IOException recieved when trying to access http://faculty.marianopolis.edu/c.belanger/quebechistory/encyclopedia/BankinginCanada-CanadianBanks-CanadianHistory.htm=[[The .
00:08:45 - Saving changes to Canada Bank Act

00:08:55 - Processing Candice Bergen
00:08:58 - No {{cite web}} errors found in this article.

00:08:58 - Processing Cecile B. Kremer
00:09:00 - Possible fixable errors found, attempting corrections
00:09:01 - Getting text of Cecile B. Kremer
00:09:01 - Trying to add a title= parameter to {{cite web|url=}}
00:09:26 - Logging out and shutting down.

Problems noted

  • Failed to identify the correct situation in 'O Sole Mio, and attempted to add the author's name as an existing title (diff). The likely cause is that the template did not in fact have a named |url= parameter, and the next space happened to be in the author's name.
    • If url= cannot be found, search for http://. If that can't be found, give up and flag for review - Done in 1.1.4b
    • If url= cannot be found but http:// can, add url= where needed - Done in 1.1.4b
    • When searching for an existing title, don't keep searching beyond the end of the parameter (stop at the pipe or end brackets) - Already does this; the apparent misbehavior was the result of other misbehavior, now fixed.
  • Lots of issues with ASCII art. First, the article was incorrectly identified as having a fixable error. Second, the bot tried to add a title where one already existed. Third, it grabbed the wrong parameter for the URL. Fourth, it saved an edit converting &lt; and &gt; entities to < and > symbols.
    • Fix the search string for the "archiveurl missing archivedate" error; that's what caused the misidentification. - Done in 1.1.4b
    • Need to figure out why the template was flagged for not having a title parameter, although maybe having the template on multiple lines caused it? - Done in 1.1.4b; apparently the . character in Java regexes doesn't match newlines by default. Odd.
    • Issue #3 is likely the same problem as #2, thus needing a similar fix - Done per above
    • Not sure issue #4 can be fixed; the bot framework decodes all that itself, and in some cases those corrections may be necessary. - Won't fix
  • The problem with Canada Bank Act is to be expected; here the template is malformed, and the bot won't be able to recognize that the URL has in fact ended, as equal signs are commonly found in URLs. Nothing to be fixed there; the page would have been flagged for review.
  • The bot crashed when trying to access the blank URL at Cecile B. Kremer.
    • I need to add code to the bot to tell it to flag blank url arguments for attention and not try to connect to them. - Done in 1.1.4b
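
The newline behavior behind the ASCII art misidentification can be reproduced in isolation. The following is a minimal sketch, not the bot's actual code: the class name, the pattern, and the sample template are assumptions made up for illustration. In java.util.regex the . character does not match line terminators unless the pattern is compiled with Pattern.DOTALL, so a title= parameter on a later line of a multi-line template goes unseen.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names come from the bot's source.
final class MultiLineTemplateMatch {
    // Default mode: '.' stops at line terminators, so this pattern cannot
    // reach a title= parameter that sits on a later line of the template.
    static final Pattern SINGLE_LINE =
            Pattern.compile("\\{\\{cite web.*\\|\\s*title\\s*=");

    // Same expression, but Pattern.DOTALL lets '.' match line terminators too.
    static final Pattern DOT_ALL =
            Pattern.compile("\\{\\{cite web.*\\|\\s*title\\s*=", Pattern.DOTALL);

    // True if the pattern finds a title= parameter in the template text.
    static boolean hasTitle(Pattern pattern, String template) {
        return pattern.matcher(template).find();
    }

    public static void main(String[] args) {
        String template =
                "{{cite web\n   |first=Simon\n   |title=Star \"ASCIImation\" Wars\n}}";
        System.out.println(hasTitle(SINGLE_LINE, template)); // prints false
        System.out.println(hasTitle(DOT_ALL, template));     // prints true
    }
}
```

The sketch assumes it is handed the text of a single template. Run against a whole article, the greedy .* under DOTALL would happily run past the closing braces into later templates, so the match would need to be bounded in some other way.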

Trial 3 - Failed

On this run, the bot made two correct edits and one incorrect edit at Canada Bank Act. It also reported that it had edited a fourth page, although no edit was actually made, so I'm not really sure what happened there. The bot terminated its run early because of a runtime error while processing Fowey. The bot's log is provided below.


------------------------------------------------------
HersfoldCiteBot Operation Log
Running version 1.1.4b
September 29 2010, 00:17:02 UTC
------------------------------------------------------

This is a trial run; the bot will make 10 edits, then stop.

00:17:02 - Attempting login...

00:17:02 - Successfully logged in as HersfoldCiteBot on en.wp.

00:17:04 - Getting articles in Category:Articles with broken citations
00:17:04 - Processing 'O Sole Mio
00:17:05 - Possible fixable errors found, attempting corrections
00:17:05 - Getting text of 'O Sole Mio
00:17:05 - Trying to add a title= parameter to {{citeweb|http://www.thisissuttoncoldfield.co.uk/news/Tributes-Renato-8211-singing-sensation/article-1237758-detail/article.html|author=J Newton|publisher=This Is Sutton Coldfield|date=2009-08-10|accessdate=2010-09-12}}
00:17:06 - Saving changes to 'O Sole Mio

00:17:16 - Processing 2010 Women's Rugby World Cup squads
00:17:22 - No {{cite web}} errors found in this article.

00:17:22 - Processing 5 Centimeters Per Second
00:17:27 - No {{cite web}} errors found in this article.

00:17:27 - Processing AIESEC
00:17:29 - No {{cite web}} errors found in this article.

00:17:29 - Processing ASCII art
00:17:32 - No {{cite web}} errors found in this article.

00:17:32 - Processing Aaliyah (album)
00:17:37 - No {{cite web}} errors found in this article.

00:17:37 - Processing Aberdeen
00:17:49 - No {{cite web}} errors found in this article.

00:17:49 - Processing Alan Ritchson
00:17:50 - No {{cite web}} errors found in this article.

00:17:50 - Processing Alex Timbers
00:17:51 - No {{cite web}} errors found in this article.

00:17:51 - Processing All Our Kings Are Dead
00:17:53 - No {{cite web}} errors found in this article.

00:17:53 - Processing American Idiot (song)
00:17:54 - No {{cite web}} errors found in this article.

00:17:54 - Processing American Slang
00:17:56 - No {{cite web}} errors found in this article.

00:17:56 - Processing Amy Irving
00:17:57 - No {{cite web}} errors found in this article.

00:17:57 - Processing Ann-Margret
00:18:01 - No {{cite web}} errors found in this article.

00:18:01 - Processing Aranya
00:18:02 - No {{cite web}} errors found in this article.

00:18:02 - Processing Arts in Rome
00:18:03 - No {{cite web}} errors found in this article.

00:18:03 - Processing Auburn University
00:18:08 - No {{cite web}} errors found in this article.

00:18:08 - Processing Axe (grooming product)
00:18:12 - No {{cite web}} errors found in this article.

00:18:12 - Processing Bagri (clan)
00:18:14 - No {{cite web}} errors found in this article.

00:18:14 - Processing Bandera, Texas
00:18:17 - No {{cite web}} errors found in this article.

00:18:17 - Processing Basel
00:18:27 - No {{cite web}} errors found in this article.

00:18:27 - Processing Bay Village, Ohio
00:18:29 - No {{cite web}} errors found in this article.

00:18:29 - Processing Belgian nationality law
00:18:30 - No {{cite web}} errors found in this article.

00:18:30 - Processing Big Four (audit firms)
00:18:32 - No {{cite web}} errors found in this article.

00:18:32 - Processing Blake Harrison
00:18:33 - No {{cite web}} errors found in this article.

00:18:33 - Processing Blake LeVine
00:18:34 - No {{cite web}} errors found in this article.

00:18:34 - Processing Blindcrake
00:18:41 - No {{cite web}} errors found in this article.

00:18:41 - Processing Bloggingheads.tv
00:18:45 - No {{cite web}} errors found in this article.

00:18:45 - Processing Borders Group
00:18:49 - No {{cite web}} errors found in this article.

00:18:49 - Processing Britney's New Look
00:18:50 - No {{cite web}} errors found in this article.

00:18:50 - Processing Bruce Van Voorhis
00:18:51 - No {{cite web}} errors found in this article.

00:18:51 - Processing C++
00:18:56 - No {{cite web}} errors found in this article.

00:18:56 - Processing Cafe Antarsia Ensemble
00:18:56 - No {{cite web}} errors found in this article.

00:18:56 - Processing California Polytechnic State University
00:19:04 - No {{cite web}} errors found in this article.

00:19:04 - Processing Canada Bank Act
00:19:05 - Possible fixable errors found, attempting corrections
00:19:05 - Getting text of Canada Bank Act
00:19:05 - Trying to add a title= parameter to {{cite web|url=http://faculty.marianopolis.edu/c.belanger/quebechistory/encyclopedia/BankinginCanada-CanadianBanks-CanadianHistory.htm=[[The Quebec History Encyclopedia
]]}}
00:19:05 - Saving changes to Canada Bank Act

00:19:15 - Processing Candice Bergen
00:19:17 - No {{cite web}} errors found in this article.

00:19:17 - Processing Card counting
00:19:19 - No {{cite web}} errors found in this article.

00:19:19 - Processing Cassin Young
00:19:19 - No {{cite web}} errors found in this article.

00:19:19 - Processing Chew Magna
00:19:26 - Possible fixable errors found, attempting corrections
00:19:26 - Getting text of Chew Magna
00:19:26 - Trying to add a title= parameter to {{cite web |url=http:www.singstargame.com/en-gb/}}
00:19:26 - IOException recieved when trying to access http://http:www.singstargame.com/en-gb/ .
00:19:26 - Saving changes to Chew Magna

00:19:39 - Processing Child benefit
00:19:40 - No {{cite web}} errors found in this article.

00:19:40 - Processing Chilean Army
00:19:42 - No {{cite web}} errors found in this article.

00:19:42 - Processing Cinema of Nigeria
00:19:44 - No {{cite web}} errors found in this article.

00:19:44 - Processing Cleon Skousen
00:19:50 - No {{cite web}} errors found in this article.

00:19:50 - Processing Coconut cake
00:19:50 - No {{cite web}} errors found in this article.

00:19:50 - Processing Cornetto (ice cream)
00:19:51 - Possible fixable errors found, attempting corrections
00:19:51 - Getting text of Cornetto (ice cream)
00:19:51 - Trying to add a title= parameter to {{citeweb|http://www.thisissuttoncoldfield.co.uk/news/Tributes-Renato-8211-singing-sensation/article-1237758-detail/article.html|author=J Newton|publisher=This Is Sutton Coldfield|date=2009-08-10|accessdate=2010-09-12}}
00:19:52 - Saving changes to Cornetto (ice cream)

00:20:02 - Processing Dartmouth College
00:20:24 - No {{cite web}} errors found in this article.

00:20:24 - Processing David Miliband
00:20:32 - No {{cite web}} errors found in this article.

00:20:32 - Processing Davy Fresh
00:20:33 - No {{cite web}} errors found in this article.

00:20:33 - Processing Decoder Ring Theatre
00:20:34 - No {{cite web}} errors found in this article.

00:20:34 - Processing Delta Air Lines
00:20:43 - No {{cite web}} errors found in this article.

00:20:43 - Processing Demi Lovato
00:20:48 - No {{cite web}} errors found in this article.

00:20:48 - Processing Demographics of Italy
00:20:52 - No {{cite web}} errors found in this article.

00:20:52 - Processing Digital terrestrial television
00:21:08 - No {{cite web}} errors found in this article.

00:21:08 - Processing Dr Pepper
00:21:12 - No {{cite web}} errors found in this article.

00:21:12 - Processing Dubstar
00:21:13 - No {{cite web}} errors found in this article.

00:21:13 - Processing Dylan Baker
00:21:14 - No {{cite web}} errors found in this article.

00:21:14 - Processing Dynasty (TV series)
00:21:18 - No {{cite web}} errors found in this article.

00:21:18 - Processing Eagles of Death Metal
00:21:19 - No {{cite web}} errors found in this article.

00:21:19 - Processing Economy of Rome
00:21:20 - No {{cite web}} errors found in this article.

00:21:20 - Processing Edward P. Jones
00:21:21 - No {{cite web}} errors found in this article.

00:21:21 - Processing Electronic waste
00:21:25 - No {{cite web}} errors found in this article.

00:21:25 - Processing Eliza Dushku
00:21:28 - No {{cite web}} errors found in this article.

00:21:28 - Processing Emmanuelle Chriqui
00:21:29 - No {{cite web}} errors found in this article.

00:21:29 - Processing Enes Mešanović
00:21:30 - No {{cite web}} errors found in this article.

00:21:30 - Processing Esotericism
00:21:32 - No {{cite web}} errors found in this article.

00:21:32 - Processing Eye of Horus
00:21:32 - No {{cite web}} errors found in this article.

00:21:32 - Processing Fart
00:21:34 - No {{cite web}} errors found in this article.

00:21:34 - Proc

Hersfold note: The bot was apparently trying to write to the log at the time it crashed; however, I am not sure why the log is so far delayed, as "Fart" is several entries up the category from the page it crashed on, "Fowey".

Problems noted

  • Rather than logging for manual review as should have happened, the bot attempted to insert a null title in the wrong place at Canada Bank Act (diff)
  • I'm not sure what changes were supposed to get saved to Chew Magna, but I'll need to figure that out. The IOException there is to be expected, as the URL is malformed.
  • The bot crashed due to a StringIndexOutOfBoundsException, reporting the following:
    Exception in thread "Thread-5" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(Unknown Source)
    at citation.HersfoldCiteBot.correctCiteWebErrors(HersfoldCiteBot.java:653)
    at citation.HersfoldCiteBot.run(HersfoldCiteBot.java:216)
    at java.lang.Thread.run(Unknown Source)
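
The trace shows String.substring being handed an index of -1, which is the value String.indexOf returns when the search string is absent. Whether an unchecked indexOf result is really what reaches substring at line 653 is an assumption; the sketch below only shows one way to guard against it. The class and method names are hypothetical, and the logic follows the rules listed under Trial 2: look for a named url= parameter, fall back to a bare http:// link, stop at the next pipe or closing braces, and give up (flag for review) on anything blank or unterminated.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not taken from HersfoldCiteBot.java.
final class CiteWebUrlFinder {
    // A named url parameter, allowing whitespace around the name.
    private static final Pattern URL_PARAM = Pattern.compile("\\|\\s*url\\s*=");

    /**
     * Returns the URL found in the text of one {{cite web}} template, or
     * Optional.empty() if the template should be flagged for manual review.
     */
    static Optional<String> findUrl(String template) {
        int start;
        Matcher named = URL_PARAM.matcher(template);
        if (named.find()) {
            start = named.end();
        } else {
            // No named parameter: fall back to a bare http:// link.
            start = template.indexOf("http://");
            if (start < 0) {
                return Optional.empty(); // nothing to work with
            }
        }
        // The value ends at the next pipe or at the closing braces,
        // whichever comes first. Either search may return -1.
        int pipe = template.indexOf('|', start);
        int close = template.indexOf("}}", start);
        int end = (pipe >= 0 && (close < 0 || pipe < close)) ? pipe : close;
        if (end < 0) {
            return Optional.empty(); // unterminated template
        }
        String url = template.substring(start, end).trim();
        return url.isEmpty() ? Optional.empty() : Optional.of(url);
    }

    public static void main(String[] args) {
        System.out.println(findUrl("{{cite web|url=}}"));
        System.out.println(findUrl("{{cite web |url=http:www.singstargame.com/en-gb/}}"));
    }
}
```

Every index is checked before it reaches substring, so a missing or blank parameter yields an empty result instead of an exception. The sketch does not validate the URL itself: a malformed value such as the one at Chew Magna is returned as-is and would still need a separate check before the bot tries to connect to it.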