MWoffliner is a tool for creating a local offline HTML snapshot of any online MediaWiki instance. It scrapes all articles (or a selection if specified) and creates the corresponding ZIM file. While primarily targeted for Wikimedia projects like Wikipedia and Wiktionary, MWoffliner also supports any recent MediaWiki instance (version 1.27+), though instances with custom skins or highly unusual configurations may have limitations.
Read CONTRIBUTING.md to learn more about MWoffliner development.
User help is available in the FAQ.
- Scrape with or without image thumbnails
- Scrape with or without audio/video multimedia content
- S3 cache (optional)
- Image size optimization and WebP conversion
- Scrape all articles in the selected namespaces, or only the articles named in a title list
- Specify additional/non-main namespaces to scrape
Run mwoffliner --help to see all available options.
- Docker (or Docker-based engine)
- amd64 architecture
The recommended way to install and run mwoffliner is using the pre-built Docker container:
docker pull ghcr.io/openzim/mwoffliner
Run software locally / Build from source
- *NIX Operating System (GNU/Linux, macOS, etc.)
- Redis: in-memory data store
- Node.js version 24 (only this single Node.js version is supported; other versions may or may not work)
- Libzim: C++ library for creating ZIM files (automatically downloaded on GNU/Linux & macOS)
- Various build tools which are probably already installed on your machine:
  - libjpeg-dev: JPEG image processing
  - libglu1: OpenGL utility library
  - autoconf: automatic configuration system
  - automake: Makefile generator
  - gcc: C compiler
(These package names are for Debian/Ubuntu systems.)
An online MediaWiki instance with its API available.
- Clone the repository locally:
  git clone https://github.com/openzim/mwoffliner.git && cd mwoffliner
- Build the image:
  docker build . -f docker/Dockerfile -t ghcr.io/openzim/mwoffliner
[!WARNING] Local installation requires several system dependencies (see above). Using the Docker image is strongly recommended to avoid setup issues.
Setting up MWoffliner locally for development can be tricky due to several dependencies and version requirements. Follow these steps carefully to avoid common errors.
MWoffliner requires Node.js 24 (other versions may fail).
Compatible Node 24 ranges: >=24 <24.6 or >=24.7 <25.
Check your version:
node -v
If your version does not match, use nvm to install the correct Node.js version.
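The compatible ranges listed above can also be checked programmatically. The following is a minimal sketch and not part of MWoffliner: `isSupportedNode` is a hypothetical helper name, and the check only compares the major and minor components, so it assumes a plain release version string such as `v24.5.0`.

```javascript
// Hypothetical helper: returns true when `version` falls inside the
// ranges stated above (>=24 <24.6 or >=24.7 <25).
function isSupportedNode(version) {
  // Accept both "v24.5.0" (output of `node -v`) and "24.5.0".
  const [major, minor = 0] = version.replace(/^v/, '').split('.').map(Number);
  // Inside major version 24, only the 24.6.x releases are excluded.
  return major === 24 && (minor < 6 || minor >= 7);
}

// Report on the Node.js version that is running this script.
const current = process.versions.node;
console.log(
  `Node.js ${current} is ${isSupportedNode(current) ? 'inside' : 'outside'} the compatible ranges`
);
```

Saved under a name of your choice (for example `check-node.mjs`), the script can be run with `node check-node.mjs` before attempting `npm install`.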
MWoffliner depends on @openzim/libzim, which requires the C++ libzim library.
- On Linux/macOS, MWoffliner can download libzim automatically.
- On Windows, you must install libzim manually because there are no prebuilt binaries. See the libzim installation guide for details.
Node 24 on Windows officially supports Visual Studio 2019 (v16) or Visual Studio 2022 (v17).
Ensure C++ build tools are installed and environment variables are set correctly. See Windows Setup for node-gyp for detailed instructions.
MWoffliner uses node-gyp, which enforces strict checks for Node and compiler versions. Make sure you have:
- Proper Visual Studio version (Windows); see Visual Studio versions
- Required C++ headers, e.g., zim/archive.h; see libzim documentation
- Python 3.10+ (required by node-gyp; a recent version is preferred for compatibility)
- Clear the npm cache. A corrupted cache can cause cryptic install failures:
  npm cache clean --force
- Delete node_modules and reinstall. Stale or partially installed dependencies are a common source of errors:
  rm -rf node_modules package-lock.json
  npm install
- Check that all environment variables are set. Especially on Windows, PATH, INCLUDE, and LIB must point to the correct Visual Studio and libzim directories. Reopen your terminal after installing new tools.
- Verify Redis is running before starting MWoffliner. MWoffliner will fail immediately if it cannot connect to Redis:
  redis-cli ping # expected output: PONG
- Run npm install with verbose logging to see exactly where it fails:
  npm install --verbose
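The environment-variable tip in the list above can be partly automated. The sketch below is an illustration only, not part of MWoffliner: `missingBuildVars` is a hypothetical helper, and it merely reports which of `PATH`, `INCLUDE`, and `LIB` are unset or empty. It does not verify that they point to the correct Visual Studio or libzim directories.

```javascript
// Hypothetical helper: returns the names from `required` that are
// unset or blank in `env`. Names are compared case-insensitively,
// since Windows treats "Path" and "PATH" as the same variable.
function missingBuildVars(env, required = ['PATH', 'INCLUDE', 'LIB']) {
  const present = new Set(
    Object.entries(env)
      .filter(([, value]) => typeof value === 'string' && value.trim() !== '')
      .map(([name]) => name.toUpperCase())
  );
  return required.filter((name) => !present.has(name));
}

// INCLUDE and LIB are normally only set by the Visual Studio developer
// environment, so the report is limited to Windows.
if (process.platform === 'win32') {
  const missing = missingBuildVars(process.env);
  console.log(
    missing.length === 0
      ? 'PATH, INCLUDE and LIB are all set'
      : `Unset or empty: ${missing.join(', ')}`
  );
}
```

Run it from the same terminal in which you intend to run `npm install`, because a terminal opened before the tools were installed may still carry the old environment.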
| Error | Cause | Solution |
|---|---|---|
| Node.js version error | Node.js version incompatible | Install Node 24 with nvm |
| Cannot find module @openzim/libzim | libzim not installed | Follow libzim installation guide; Windows users must install manually |
| node-gyp rebuild failed | Wrong Node or compiler version | Check Node.js version, Visual Studio version, Python 3.x |
| zim/archive.h not found | C++ headers missing | Install libzim system-wide, verify include paths |
[!NOTE] Even with these steps, other setup errors may occur. Using Docker is strongly recommended for a smoother experience.
npm i -g mwoffliner
[!WARNING] You might need to run this command with the sudo command, depending on how your npm / OS is configured. npm permission checking can be a bit annoying for newcomers. Please read the npm script documentation if you encounter issues.
# Get help
docker run -v $(pwd)/out:/out -ti ghcr.io/openzim/mwoffliner mwoffliner --help
# Create a ZIM for https://bm.wikipedia.org
docker run -v $(pwd)/out:/out -ti ghcr.io/openzim/mwoffliner \
  mwoffliner --mwUrl=https://bm.wikipedia.org --adminEmail=foo@bar.net
Using NPM / Local Install
# Get help
mwoffliner --help
# Create a ZIM for https://bm.wikipedia.org
mwoffliner --mwUrl=https://bm.wikipedia.org --adminEmail=foo@bar.net
To use MWoffliner with an S3 cache, provide an S3 URL:
--optimisationCacheUrl="https://wasabisys.com/?bucketName=my-bucket&keyId=my-key-id&secretAccessKey=my-sac"
If you've retrieved the MWoffliner source code (e.g., via a git clone), you can install and run it locally with your modifications:
npm i
npm run mwoffliner -- --help
Detailed contribution documentation and guidelines are available.
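For the S3 cache option described earlier, assembling the URL by hand becomes error-prone once a bucket name or key contains characters that are not URL-safe. The following is a minimal sketch using the standard `URL` class: `buildS3CacheUrl` is a hypothetical helper, the query parameter names are the ones from the example URL above, and it is an assumption that MWoffliner decodes percent-encoded values, which is worth verifying against your setup.

```javascript
// Hypothetical helper: builds a value for --optimisationCacheUrl from an
// S3 endpoint and credentials, percent-encoding each value as needed.
function buildS3CacheUrl(endpoint, { bucketName, keyId, secretAccessKey }) {
  const url = new URL(endpoint);
  url.searchParams.set('bucketName', bucketName);
  url.searchParams.set('keyId', keyId);
  url.searchParams.set('secretAccessKey', secretAccessKey);
  return url.toString();
}

// Reproduces the example URL from this document.
const cacheUrl = buildS3CacheUrl('https://wasabisys.com', {
  bucketName: 'my-bucket',
  keyId: 'my-key-id',
  secretAccessKey: 'my-sac',
});
console.log(`--optimisationCacheUrl="${cacheUrl}"`);
```

Keeping real credentials out of the script, for instance by reading them from environment variables, avoids committing secrets to version control.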
MWoffliner provides an API and can be used as a Node.js library. Here's a stub example for your index.mjs file:
import * as mwoffliner from 'mwoffliner';
const parameters = {
mwUrl: "https://es.wikipedia.org",
adminEmail: "foo@bar.net",
verbose: true,
format: "nopic",
articleList: "./articleList"
};
mwoffliner.execute(parameters); // returns a Promise
Complementary information about MWoffliner:
- MediaWiki software is used by thousands of wikis, the most famous ones being the Wikimedia ones, including Wikipedia.
- MediaWiki is a PHP wiki runtime engine.
- Wikitext is the markup language that MediaWiki uses.
- MediaWiki parser converts Wikitext to HTML, which displays in your browser.
- Read the scraper functional architecture for more details.
GPLv3 or later, see LICENSE for more details.
This project received funding through NGI Zero Core, a fund established by NLnet with financial support from the European Commission's Next Generation Internet program. Learn more at the NLnet project page.


