User:CbmBOT

Bot functions

  • Updates the table under Number of Articles Remaining in Category:Cleanup by month. To do so, the bot follows three guidelines:
    1. The page count for each month is the sum of the pages needing cleanup and the music pages needing cleanup for that month. For example, for July 2005, the categories Category:Cleanup from July 2005 and Category:Music cleanup from July 2005 are taken into account.
    2. Articles in subcategories are not counted twice.
    3. Listed pages of the form Wikipedia:Cleanup/<MONTH> (such as Wikipedia:Cleanup/June) are ignored for counting purposes, because they are information pages about what needs cleanup rather than pages in need of cleanup themselves.
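
The three guidelines above can be illustrated with a minimal sketch. It is written in Python for brevity (the bot itself is a PHP script), and it assumes the page titles of both monthly categories are already available as lists. The names monthly_count and INFO_PAGE are hypothetical and do not come from the bot's source.

```python
import re

# Hypothetical pattern for information pages such as "Wikipedia:Cleanup/June".
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
INFO_PAGE = re.compile(rf"^Wikipedia:Cleanup/(?:{MONTHS})$")


def monthly_count(cleanup_titles, music_cleanup_titles):
    """Count the distinct pages needing cleanup for one month."""
    # Guidelines 1 and 2: combine both categories; the set union
    # counts a page once even if it is listed in both.
    titles = set(cleanup_titles) | set(music_cleanup_titles)
    # Guideline 3: skip information pages, which do not need cleanup.
    return sum(1 for title in titles if not INFO_PAGE.match(title))
```

For example, monthly_count(["A", "B", "Wikipedia:Cleanup/June"], ["B", "C"]) returns 3: "B" is counted once and the information page is skipped.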

Bot internals

  • The bot starts at Category:Cleanup by month and collects the categories (listed under the Subcategories section on that page), named "Cleanup from {MONTH} {YEAR}", that contain pages needing cleanup.
  • Each category page is inspected, and the number of pages in that category is calculated:
    • The bot looks for the string "There are ## pages in this section of this category." at the top of the "Pages in category..." section on each category page, and keeps track of that number.
    • The bot will follow "(next 200)" links on category pages in order to get the complete count for the category.
  • The bot repeats the previous process, using the subcategories on Category:Music cleanup by month.
  • The bot aborts immediately if a count of 0 is returned for any category. A zero count should be impossible, so it means that the bot had trouble parsing a page or, more likely, timed out while trying to do so.
  • If the bot successfully retrieved information from each category, it will pull the total number of articles from Special:Statistics.
  • The bot will then format the information gleaned into wikicode, and update the section.
  • The bot keeps track of the elapsed time and number of pages processed. On average, a successful run takes about three minutes and processes fewer than one hundred pages.
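
The counting and abort steps described above might look roughly like the following sketch. It is in Python rather than the bot's PHP, and it assumes the HTML of each category page (the first page plus every "(next 200)" page) has already been fetched. The names COUNT_RE, CountError, section_count and category_total are hypothetical.

```python
import re

# Matches the sentence the bot looks for on each category page.
COUNT_RE = re.compile(
    r"There are (\d+) pages in this section of this category\."
)


class CountError(Exception):
    """Raised when a category yields a count of 0 (parse failure or timeout)."""


def section_count(html):
    """Return the page count stated on one category page, or 0 if absent."""
    match = COUNT_RE.search(html)
    return int(match.group(1)) if match else 0


def category_total(pages_html):
    """Sum the counts over every page of one category listing."""
    total = sum(section_count(html) for html in pages_html)
    if total == 0:
        # A zero count is treated as an error, so the run is aborted.
        raise CountError("category returned a count of 0")
    return total
```

Raising an exception on a zero count lets the caller stop before anything is written, which matches the stated behaviour that the bot does not update the article if any errors are detected.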

Bot description

  • This bot is a PHP 5.1.4 script that runs on Unix and uses cURL and regular expression parsing.
  • The bot is currently run manually, though there is no real reason not to run it from a cron job once approved.
  • The bot needs to run only once a day, and it can be relegated to running during off-peak hours.
  • The bot will not update the article if any errors are detected.
  • The current (running) version of this bot is 2.0.9, updated 2007-02-20.
  • Maintainer is User:Dvandersluis.