The MediaWiki software can be used as a service-oriented architecture, providing services to obtain pages as HTML without skin or as raw text content, to update information, to get lists of categories or links on a page, etc. Creating a bot describes the use of the APIs in detail, including how to create and use edit tokens. A "bot" is a colloquial term for software that works against MediaWiki as a service.
The primary interfaces are the normal index.php (URL-parameter) interface and the special api.php interface (returning various formats such as XML or JSON).
See the documentation of the parameters to index.php.
Especially useful parameters are:
- action=raw returns the text as it is visible in view source or edit mode. Example: http://220.127.116.11/metawiki/index.php?title=Main_Page&action=raw
- If the page has templates, these can first be expanded by specifying action=raw&templates=expand
- action=render returns HTML, but without toolboxes or skin (and without CSS styling!). Example: http://18.104.22.168/metawiki/index.php?title=Main_Page&action=render
- index.php can also be used for saving, after obtaining an edit token.
Status of API: Deprecated for software interaction.
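The index.php parameters listed above can also be assembled programmatically. The following is a minimal sketch in Python, not a definitive client: the base URL `http://example.org/w/index.php` and the helper name `index_url` are assumptions made for illustration, and the code only builds the request URLs, leaving the actual fetching to the caller.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the index.php of the target wiki.
INDEX_PHP = "http://example.org/w/index.php"


def index_url(title, action, expand_templates=False, base=INDEX_PHP):
    """Build an index.php URL for action=raw or action=render."""
    params = {"title": title, "action": action}
    if expand_templates:
        # Only meaningful together with action=raw (see the list above).
        params["templates"] = "expand"
    return base + "?" + urlencode(params)


print(index_url("Main_Page", "raw"))
print(index_url("Main_Page", "raw", expand_templates=True))
print(index_url("Main_Page", "render"))
```

The resulting URLs can be passed to any HTTP client; `urlencode` takes care of escaping titles that contain spaces or other special characters.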
This is a special software API to MediaWiki that allows clients to log in and to read and write pages, returning XML or JSON. By default, the API is turned on for reading and off for writing. Most Wikimedia Foundation wikis have the write API turned on.
- http://22.214.171.124/metawiki/api.php?action=query&titles=Main_Page does not return the page content, but is useful to test for existence!
- To get the content of a page, request the last revision with "content" in the revision properties (rvprop): http://126.96.36.199/metawiki/api.php?action=query&prop=revisions&titles=Main_Page&rvprop=timestamp%7Cuser%7Ccomment%7Ccontent
- To harvest any kind of data from the wiki, list the recent changes as in: http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcprop=title%7Csizes%7Cuser&rclimit=500
- Useful parameters are: rclimit: the maximum number of changes to list (10 by default); rcstart: the timestamp to start listing from; rcend: the timestamp to end listing at; rcprop: which properties to get (e.g., user, timestamp, ids, title, sizes)
- It does not seem to be possible to restrict recentchanges to a category; as an alternative, the members of a category can be listed sorted by timestamp, as in the next item. Note: further testing is required to find out whether recent changes can be obtained for a single category only.
- To harvest data by category, sorted by date of change descending, use, e.g., http://188.8.131.52/metawiki/api.php?action=query&list=categorymembers&cmtitle=Category:Template_documentation&cmsort=timestamp&cmdir=desc
- The above can be improved by also outputting the timestamp; this makes it possible to update harvested data by reading the list only until the date of the last harvest is reached: http://184.108.40.206/metawiki/api.php?action=query&list=categorymembers&cmtitle=Category:Template_documentation&cmsort=timestamp&cmdir=desc&cmprop=title%7Ctimestamp. Note: The timestamp here appears not to represent the date of the last change but the creation date of the page.
Note: the edit API does not use XML, but uploads the raw page content directly, very similar to the index.php interface!
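The queries in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not a complete client: the endpoint `http://example.org/w/api.php` and the helper names (`query_url`, `page_exists`, `page_text`, `changed_since`) are hypothetical, `format=json` is added to request JSON output, and the sample responses are hand-written data shaped like the legacy JSON format (page text under the `*` key), not output from a real wiki.

```python
from itertools import takewhile
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the api.php of the target wiki.
API_PHP = "http://example.org/w/api.php"


def query_url(base=API_PHP, **params):
    """Build an action=query URL; list values are joined with '|'."""
    merged = {"action": "query", "format": "json"}
    for key, value in params.items():
        merged[key] = "|".join(value) if isinstance(value, (list, tuple)) else value
    return base + "?" + urlencode(merged)


def page_exists(response):
    """True if no page entry in the response carries the 'missing' flag."""
    return all("missing" not in page for page in response["query"]["pages"].values())


def page_text(response):
    """Wikitext of the first returned revision, or None if there is none."""
    for page in response["query"]["pages"].values():
        for revision in page.get("revisions", []):
            return revision.get("*")
    return None


def changed_since(members, last_harvest):
    """Titles from a cmdir=desc listing that are newer than last_harvest.

    ISO 8601 UTC timestamps of equal length compare correctly as strings.
    """
    newer = takewhile(lambda m: m["timestamp"] > last_harvest, members)
    return [m["title"] for m in newer]


# Hand-written sample data (assumed shape of legacy format=json responses).
SAMPLE_PAGE = {"query": {"pages": {"1": {
    "pageid": 1, "ns": 0, "title": "Main Page",
    "revisions": [{"user": "Example", "timestamp": "2010-01-02T03:04:05Z",
                   "comment": "", "*": "Welcome!"}]}}}}
SAMPLE_MISSING = {"query": {"pages": {"-1": {
    "ns": 0, "title": "No such page", "missing": ""}}}}
SAMPLE_MEMBERS = [
    {"title": "Template:C", "timestamp": "2010-03-01T00:00:00Z"},
    {"title": "Template:B", "timestamp": "2010-02-01T00:00:00Z"},
    {"title": "Template:A", "timestamp": "2010-01-01T00:00:00Z"},
]

print(query_url(prop="revisions", titles="Main_Page",
                rvprop=["timestamp", "user", "comment", "content"]))
print(page_exists(SAMPLE_PAGE), page_exists(SAMPLE_MISSING))
print(page_text(SAMPLE_PAGE))
print(changed_since(SAMPLE_MEMBERS, "2010-01-15T00:00:00Z"))
```

Because the category listing is sorted by timestamp in descending order, `changed_since` can stop at the first entry that is not newer than the previous harvest instead of scanning the whole category.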
Status of API: Recommended.
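As noted above, saving a page sends the raw page content directly as a request parameter. A minimal sketch of how such a write request might be prepared in Python is shown below. It assumes the `action=edit` module with its `title`, `text` and `token` parameters, a hypothetical endpoint `http://example.org/w/api.php`, and a placeholder token; obtaining a real edit token (see Creating a bot) and logging in are not shown, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace with the api.php of the target wiki.
API_PHP = "http://example.org/w/api.php"


def build_edit_request(title, text, token, base=API_PHP):
    """Prepare (but do not send) a POST request that saves `text` to `title`."""
    fields = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,    # raw wikitext, sent as an ordinary form field
        "token": token,  # edit token obtained from the wiki beforehand
    }
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Supplying a request body makes urllib issue a POST instead of a GET.
    return Request(base, data=body, headers=headers)


# PLACEHOLDER_TOKEN is not a valid token; it only marks where one belongs.
request = build_edit_request("Sandbox", "Hello, wiki!", "PLACEHOLDER_TOKEN")
print(request.get_method())
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it; on a wiki with the write API turned off, the request is expected to be rejected.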
On the current Key to Nature site the API is not usable because the site runs an old version of MediaWiki. Therefore, the examples currently run against the separate KeyToNature Test wiki.