mediawiki-client-tools/mediawiki-dump-generator

 
 

MediaWiki Dump Generator

MediaWiki Dump Generator can archive wikis from the largest to the tiniest.

MediaWiki Dump Generator is a project to port the legacy wikiteam toolset to Python 3 and publish it on PyPI, making it more accessible to today's archivers.

Most of the focus has been on the core dumpgenerator tool. Python 3 versions of the other wikiteam tools may be added over time.

The project is currently mostly in maintenance mode. We will do our best to prevent the project from breaking entirely. Issues and pull requests are welcome but may not be reviewed promptly.

MediaWiki Dump Generator Toolset

MediaWiki Dump Generator is a set of tools for archiving wikis. The main general-purpose module of MediaWiki Dump Generator is dumpgenerator, which can download XML dumps of MediaWiki sites that can then be parsed or redeployed elsewhere.

Wikipedia is not a good target for these tools: it is far too large to dump easily, and its dumps are already freely available.
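To illustrate what "parsed" can mean for an XML dump, here is a minimal sketch that streams pages out of a MediaWiki XML export using only the Python standard library. It is not part of MediaWiki Dump Generator: `iter_pages`, `local_name` and `SAMPLE` are hypothetical names chosen for this example, and the sample document is hand-written on the assumption that the dump follows the usual MediaWiki export layout (`<page>` elements containing a `<title>` and one or more `<revision>`/`<text>` elements). A real dump may use a different schema version in its namespace, which is why the sketch ignores the namespace entirely.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page> in a MediaWiki XML export.

    `text` is the content of the last <text> element listed in the page,
    or an empty string if the page has none.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = None
        text = ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the finished page to keep memory use low


# Hand-written sample (an assumption, not output from a real wiki).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Main Page</title>
    <revision><id>1</id><text>Hello, wiki!</text></revision>
  </page>
</mediawiki>"""

for title, text in iter_pages(io.BytesIO(SAMPLE)):
    print(title, len(text))
```

For a dump on disk, `iter_pages` can be given a file path or an open binary file in place of the `io.BytesIO` object. Streaming with `iterparse` avoids loading the whole dump into memory, which matters for larger wikis.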

Installing the tools

For prerequisites and installation, see Installation.

Using the tools

For usage, see Usage.

Publishing the dump

Please consider publishing your wiki dump(s). You can do it yourself as explained in Publishing.

Getting help

  • You can read and post in MediaWiki Client Tools' GitHub Discussions.
  • If you need help (other than reporting a bug), you can reach out on MediaWiki Client Tools' Discussions/Q&A.

Contributing

For information on reporting bugs and proposing changes, please see the Contributing guide.

Code of Conduct

mediawiki-client-tools has a Code of Conduct.

At the moment the only person responsible for reviewing CoC reports is the repository administrator, Janet Cobb, reachable at [email protected]. Please state up front if your message concerns the Code of Conduct, as these messages are confidential.

In case of emergency (i.e. if Janet is not reachable or if such an issue involves her), you can contact Elsie Hupp, who also retains privileges over this repository, directly via email at [email protected] or on Matrix at @elsiehupp:beeper.com.

Contributors

WikiTeam is the Archive Team [GitHub] subcommittee on wikis. It was founded and originally developed by Emilio J. Rodríguez-Posada, a veteran Wikipedia editor and amateur archivist. Thanks to everyone who has helped, especially Federico Leva, Alex Buie, Scott Boyd, Hydriz, Platonides, Ian McEwen, Mike Dupont, balr0g and PiRSquared17.

The MediaWiki Dump Generator Python 3 initiative was started and originally maintained by Elsie Hupp; it is currently maintained primarily by Janet Cobb. We are also grateful for contributions from Victor Gambier, Thomas Karcher, yzqzss, NyaMisty and Rob Kam.

About

Python 3 tools for downloading and preserving wikis
