User:XaxaBot

From Wikipedia, the free encyclopedia

This is a robot account, currently unauthorized. But that's okay because it doesn't do anything. Leave a comment about this bot at the owner's talk page. Would you like to read the (dis)approval discussion?

In summary, this bot doesn't edit anything (at this point). But I should know what my download limitations are. The framework seems to have a throttle built in, and I have no desire to override it (I'll just let it run over night if it has to spend hours getting 100 pages). Immediately, I want to begin writing code to parse Centuries, Decades, and List of years, and then parse the lists of events in the dozens and dozens of articles linked therefrom, and save it to a local file .

Per the instructions listed at WP:BRFA:

  1. The current purpose is to gather a list of events from the Centuries, Decades, and List of years articles, process them, and save to a local text file. This bot is written in Python, using the WikipediaBot framework.
  2. I (Xaxafrad) will be running this bot only on a manual basis.
  3. Periodicity will be limited: after intially retrieving the articles, I will only need to check for changes on a monthly (or quarterly, or annual) basis.
  4. Xaxafrad is the author/maintainer.

Features[edit]

I'd like to use this bot to crawl over the Centuries and Decades articles, downloading the events and compiling them in some way, and saving a local file. I haven't yet started writing the code, but I'll keep the appropriate persons updated regarding my progress. For now, I need to work out the logic whereby it will crawl all the pages I want. I'm aware crawling isn't the preferred method of downloading articles, so I'll need to download a few pages before getting a list of articles to feed to the Special:Export.