User talk:EGAMIA-AlphaBot

Refresh Random Links
Seen at: Template:Opentask

Game Page Counter
Seen at: Template:MasterGameListCount & Template:MasterGameListCount/GameList

Currently we have unique game pages.

While MediaWiki keeps a counter of how many total pages exist in the wiki, there is no way for us to track how many pages are in specific categories.

What this piece of code does:

 * 1) Cache all links in Subcategories
 * 2) Visit each link
 * 3) Cache all links in "articles in this category"
 * 4) Do a count, removing duplicates
 * 5) Print it to a page

What this will do is search through a given category (in this case, the Game List) and do a deep search to find all the games that have been indexed. If a title appears in two or more categories, it only counts once. It's a pessimistic algorithm that will undershoot the actual number of game titles (e.g. Donkey Konga, while having two different versions, will only be counted as one; likewise, two different titles that share the same name, such as sports titles that appear on both handhelds and home consoles, will be counted as one).
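The counting steps above boil down to merging the cached link lists and letting a set drop the duplicates. A minimal sketch (the list-of-lists input shape is an assumption about how the bot caches links, not its actual code):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the duplicate-removing count described above. */
class GameCounter {
    /** Merges the link lists cached from each subcategory and counts unique titles. */
    static int countUnique(List<List<String>> categoryLinkLists) {
        Set<String> titles = new HashSet<>();
        for (List<String> links : categoryLinkLists) {
            titles.addAll(links); // a HashSet silently drops duplicate titles
        }
        return titles.size();
    }

    public static void main(String[] args) {
        // "Donkey Konga" appears in two categories but is counted once.
        int n = countUnique(List.of(
                List.of("Donkey Konga", "Mario Kart"),
                List.of("Donkey Konga", "Halo")));
        System.out.println(n); // 3
    }
}
```

This is also why the count is pessimistic: two genuinely different games with the same page title collapse into one set entry.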

Discussion about the bot
This might be about general bot use, but could we have a bot that collects all the articles that link to any day and adds the internal links to each relevant day article if they don't already exist? For example: Game XYZ was released on November 1, 2000 but has yet to be added to the November 1 article. The bot does a search, finds that Game XYZ links to November 1, 2000, excludes it if already entered, and puts an entry at the end of the November 1 article (maybe under a new section titled "entries to be indexed") that follows the text below:
 * 1) *Game XYZ release year
 * Game XYZ was released for System XYZ in Area XYZ

Could this be done? Questions I have are:
 * Can a bot be set to recognize the field in the infobox that has the release date only, not all internal date links?
 * Can the bot separate the release day and year and enter them in the example text above, along with the system it was released on and the region?
 * Can the bot exclude articles already indexed?
 * If the article has yet to be created, can the bot create it, add the calendar at the top, and add Category:Days at the end to cat the page?
 * Do we have a good standardization of the date field for this to work?

If we had this we could really cut down on current work with the date pages and easily rerun the bot when a bunch of games have been added. I'm not sure of the coding involved since I only have very basic knowledge so if this is just too much work tell me to go away. I may be able to help but I'd have to get some reading going first. Maybe even create my own bot to experiment with if that's possible. --Porphyria Plan 11:31, 1 March 2006 (CST)


 * It's something that can be done, and it makes sense to do, but there are a few problems given our current implementation of the wiki (without a SQL Database dump):


 * The ideal implementation is to grab the SQL dump, scan for potential links, and create/update that specific date.
 * The problem with that right now is that we don't have the SQL dump (not that I know of), which means the only other way is my hack-and-slash way with the current bot: download every page required from the web, and extract the information. While downloading a page every so often for ~10 pages isn't that bad, doing it for 365 pages will be hell for the server (something Achernar would probably kill me for). I can make the thread wait, which lightens the load on the server, but at the expense of a long program stall for the person running it.
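The "make the thread wait" trade-off above can be sketched as a fixed pause between page fetches. The `fetch` callback here is a hypothetical stand-in for the bot's actual page-download routine:

```java
/** Sketch of throttled crawling: a fixed pause between fetches lightens server load
    at the cost of stalling the client for (pages - 1) * delayMs milliseconds. */
class PoliteCrawler {
    /** Calls the (hypothetical) fetch routine for each title, sleeping delayMs between requests. */
    static void crawl(java.util.List<String> titles, long delayMs,
                      java.util.function.Consumer<String> fetch) {
        for (int i = 0; i < titles.size(); i++) {
            fetch.accept(titles.get(i));
            if (i < titles.size() - 1) {
                try {
                    Thread.sleep(delayMs); // the wait the server appreciates and the user doesn't
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        crawl(java.util.List.of("Page one", "Page two", "Page three"), 50,
              title -> System.out.println("fetching " + title));
        System.out.println("waited at least 100 ms: "
                + (System.currentTimeMillis() - start >= 100));
    }
}
```

At one second per page, 365 date pages already means a six-minute stall, which is the trade-off described above.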


 * To answer your questions:
 * Yes; all you would be doing is reading the data on the page, so you know what to expect. You can run a script that looks for the field entry in the infobox.
 * Yes, the bot can separate the release day and year and enter them in the example text. Including the system and the region, however, would take much more code to check. In addition, to reduce the number of hits on the server, the bot would have to store the information in a structure (hence, a SQL dump would come in handy).
 * Technically no, but because the bot will be crawling through all the pages again anyway, it would not improve efficiency or runtime. In addition, what if a page gets removed? You are better off rewriting everything from scratch; it makes no difference on a page update.
 * Yes, it can create non-existent pages, but I would warn against that. Creating date pages by script is fine, because it's bounded by the number of dates. Creating by game title? That's not bounded, and a potential vector for spam.
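The day/year split mentioned in the answers above could be done with a regex over the infobox field. This sketch assumes the field holds a wiki-linked date like `[[November 1]], [[2000]]` — the field format is an assumption about our infoboxes, not a spec:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: split a wiki-linked release date like "[[November 1]], [[2000]]" into day and year.
    The assumed field format is "[[Month Day]], [[Year]]". */
class ReleaseDateParser {
    private static final Pattern DATE =
            Pattern.compile("\\[\\[([A-Z][a-z]+ \\d{1,2})\\]\\],? \\[\\[(\\d{4})\\]\\]");

    /** Returns {day, year}, e.g. {"November 1", "2000"}, or null if the field doesn't match. */
    static String[] parse(String infoboxField) {
        Matcher m = DATE.matcher(infoboxField);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] parts = parse("release = [[November 1]], [[2000]]");
        System.out.println(parts[0] + " / " + parts[1]); // November 1 / 2000
    }
}
```

A `null` return is exactly the case where the date field isn't standardized, which is why the standardization question matters.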


 * Do we have a good standardization of the date field for this to work? - Preferred. You can get some leeway.


 * The problem I'm faced with in regards to coding is this: typical bot programming on Wikipedia is Python- or Perl-based. MediaWiki itself recommends Python, but the split in bots is around 50-50. However, I know nothing about Python and only the bare basics of Perl, not to mention that I would have no clue how to launch those scripts. Therefore, I went looking for a Java package that did the trick (as you may have seen from the "information" posted by the bot itself). It's a memory hog, and it's a long program that could be duplicated in far fewer lines of code in the other languages, but it allows me to build a user interface, and it's platform independent.--AlphaTwo 13:41, 1 March 2006 (CST)

History
Well, it's back up now. --AlphaTwo 19:54, 18 February 2007 (PST)
 * And back down again. :( --AlphaTwo 17:05, 27 February 2007 (PST)

MediaWiki 1.9.2's bricking of AlphaBot
The package I have been using, MediawikiConnector, has not been revised since 2005; back then it supported MediaWiki 1.4, with some bugs under 1.5. Unfortunately, we're now at 1.9.2, and it's causing major headaches. Basically, the setPage function (the one that writes a page) is down.

Technical details about it
Basically, when a post is made, an HTTP POST is sent; the server replies with a 302 (redirect), followed by a 200 (everything went fine). The WikiConnector skips the 302 part and is refusing to write to the page, which means something is not being sent in the POST properly. Here's a trace of the different POST headers.


From Web:

POST /w2/index.php?title=TestPage&action=submit HTTP/1.1
Host: egamia.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.10) Gecko/20070216 Firefox/1.5.0.10
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://egamia.com/w2/index.php?title=TestPage&action=edit
Cookie: mediawiki_utf8Token=6e0fd7e6c940aa7b4a500a96f3775a0e; mediawiki_utf8UserID=3; mediawiki_utf8UserName=AlphaTwo; mediawiki_utf8_session=me18a19q3rgrkoao90ig54dqn2

From HTTPClient:

POST /w2/index.php?title=TestPage&action=submit HTTP/1.1
User-Agent: Jakarta Commons-HttpClient/3.0-rc2
Host: egamia.com
Cookie: $Version=0; mediawiki_utf8_session=vgif66fm1h8rit9ukdrf1rgc83; $Path=/
Cookie: $Version=0; mediawiki_utf8UserID=6; $Path=/
Cookie: $Version=0; mediawiki_utf8UserName=AlphaBot; $Path=/
Cookie: $Version=0; mediawiki_utf8Token=3aad2b7e8a7659162b378dab705dc83f; $Path=/
Content-Length: 96
Content-Type: application/x-www-form-urlencoded

wpTextbox1=This+is+a+MediaWikiConnection+test.&wpEdittime=20070228003726&wpEditToken=&wpSummary=

If you have any idea of what the problem could be, leaving a note here or with me would be great.
 * Ok, cleared one thing up: I was not getting the EditToken back. I've read up more about it, and it seems that MediaWiki has moved to cookie authentication, which is probably causing the problem.--AlphaTwo 18:09, 27 February 2007 (PST)
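Note that in the trace above the bot posts `wpEditToken=` empty; newer MediaWiki rejects edits whose token doesn't match the one it issued. One workaround (a sketch, not the MediawikiConnector fix itself) is to scrape the token out of the edit form before posting; the attribute ordering assumed by the regex may vary between MediaWiki versions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: pull wpEditToken out of the edit-form HTML so it can be echoed back in the POST.
    Handles either attribute order; real MediaWiki markup may still differ. */
class EditTokenScraper {
    private static final Pattern TOKEN = Pattern.compile(
            "name=\"wpEditToken\"[^>]*value=\"([^\"]*)\"|value=\"([^\"]*)\"[^>]*name=\"wpEditToken\"");

    /** Returns the token, or null if the form field is missing (e.g. not logged in). */
    static String extract(String editPageHtml) {
        Matcher m = TOKEN.matcher(editPageHtml);
        if (!m.find()) return null;
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        String html = "<input type=\"hidden\" name=\"wpEditToken\" value=\"3aad2b7e8a\" />";
        System.out.println(extract(html)); // 3aad2b7e8a
    }
}
```

The extracted value would then be sent as `wpEditToken=...` in the form body, alongside the session cookies the client already carries.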


 * Hi there! I am also using MediawikiConnector and I have the same problem. Have you found any alternative Java (or non-Java) code able to perform cookie authentication? Kokomo 02:46, 24 March 2007 (PDT)

Counting Games
Instead of listing all the games on a page, can the bot instead count how many pages have the infobox template attached? I think it will be faster, since that lists only video games. I may make a template for board games and card games soon. --Cs california 00:53, 27 April 2007 (PDT)
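If this wiki's MediaWiki is new enough to ship the api.php query interface with `list=embeddedin` (an assumption worth checking against our install), counting template transclusions would indeed avoid the category crawl. A sketch of building such a query; the api.php path is a guess at this wiki's install layout:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build a MediaWiki API query listing pages that transclude a template.
    Counting the returned entries would answer "how many pages have the infobox attached". */
class TemplateCounterQuery {
    static String buildUrl(String apiBase, String templateTitle) {
        try {
            return apiBase + "?action=query&list=embeddedin&format=xml"
                    + "&eilimit=500&eititle=" + URLEncoder.encode(templateTitle, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // The api.php location and template name are assumptions for illustration.
        System.out.println(buildUrl("http://egamia.com/w2/api.php", "Template:Infobox VG"));
    }
}
```

One fetch per 500 transclusions is far gentler on the server than crawling every category page.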