Restore Wiki from a backup

benoit74 edited this page Oct 29, 2024 · 7 revisions

⚠️ This procedure applies to both of our wikis, wiki.openzim.org and wiki.kiwix.org. The steps below are written for openzim; adjust the names accordingly for kiwix.


Set-up credentials

Backups are stored in BorgBase. To download them, you need the read-only credentials:

# These are all static values that you need to enter.
# They all belong to the _slave_ (aka read-only) Bitwarden account.
export BW_CLIENTID=user.xxxxxxxxx
export BW_CLIENTSECRET=xxxxxxxxxxxx
export BW_PASSWORD=xxxxxxxxxxxx
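
Before launching any of the docker commands below, it can help to confirm that the three variables are really exported in the current shell. The following is a minimal sketch; the helper name check_bw_env is hypothetical and not part of the borg-backup tool.

```shell
# Hypothetical helper: verify that the three Bitwarden variables are set.
# Returns 0 when all are present, 1 otherwise (missing names go to stderr).
check_bw_env() {
  missing=0
  for name in BW_CLIENTID BW_CLIENTSECRET BW_PASSWORD; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_bw_env && echo "credentials are set"
```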

Select a backup

docker run -v $PWD/data/restore:/restore:rw -e BW_CLIENTID=$BW_CLIENTID -e BW_CLIENTSECRET=$BW_CLIENTSECRET -e BW_PASSWORD=$BW_PASSWORD ghcr.io/kiwix/borg-backup restore --name openzim-wiki --list

openzim-wiki is the name of the BorgBase repository in which we archive the Wiki backups.

The output looks like this:

List avaible archives ...
Remote: Warning: Permanently added the ECDSA host key for IP address '95.216.113.224' to the list of known hosts.
Warning: Attempting to access a previously unknown unencrypted repository!
Do you want to continue? [yN] yes (from BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK)
openzim-wiki__backup__2023-10-03T10:06:48 Tue, 2023-10-03 10:06:51 [70b2a142491e3a88359937de98dc50837c536a1fd03dd82fdfb3e1b301e21370]
openzim-wiki__backup__2023-10-04T10:06:48 Wed, 2023-10-04 10:06:49 [fa43670069485bd7dd6f98205eeb18eb2431f156284b7d090ffde6e406d31660]
openzim-wiki__backup__2023-10-05T10:06:44 Thu, 2023-10-05 10:06:45 [44d88bbe1e16d3837bda781eb02e7327148bf60b110f974b556fd379be604086]
openzim-wiki__backup__2023-10-06T10:06:46 Fri, 2023-10-06 10:06:47 [cd19a13b9447e0d91c0d909536071fbcc673414eaf5d288046c90cfe3981cf19]
openzim-wiki__backup__2023-10-07T10:06:43 Sat, 2023-10-07 10:06:44 [dccc4b74153d803e67d20c5eafcdef8fbf7c5af80fb0b0b90509410211315d2a]
openzim-wiki__backup__2023-10-08T10:06:44 Sun, 2023-10-08 10:06:45 [40d00ef6625ac95546f70ea7a186e4429f6956e1450df23d330563032eca37fe]
openzim-wiki__backup__2023-10-09T10:06:43 Mon, 2023-10-09 10:06:44 [6f40078ffc419d0e8ece478fa5936e3e977ca919ba499c080e4710fa0ebc80e1]

Choose one based on its date. To know how often backups are taken, check the default periodicity in the borg-backup tool and any customization in the k8s backup CronJob.

Note: the archive name is the first column (it stops at the first space), e.g. openzim-wiki__backup__2023-10-09T10:06:43.
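
If you simply want the most recent archive, the first column can be extracted mechanically. This is a minimal sketch, assuming archive names always start with the repository name followed by __backup__ and an ISO timestamp, as in the listing above; the helper name latest_archive is hypothetical.

```shell
# Hypothetical helper: read the --list output on stdin and print the name
# of the most recent archive for the repository given as $1.
latest_archive() {
  # Keep only lines whose first column starts with "<repo>__backup__",
  # print that column, then rely on ISO timestamps sorting lexically.
  awk -v prefix="${1}__backup__" 'index($1, prefix) == 1 { print $1 }' \
    | sort \
    | tail -n 1
}

# Usage (pipe the --list command shown above into the helper):
#   docker run ... restore --name openzim-wiki --list | latest_archive openzim-wiki
```

Warning and banner lines are ignored because they do not start with the expected prefix.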

Extract a Backup file

With your selected archive name, download and extract it to your filesystem:

docker run -v $PWD/data/restore:/restore:rw -e BW_CLIENTID=$BW_CLIENTID -e BW_CLIENTSECRET=$BW_CLIENTSECRET -e BW_PASSWORD=$BW_PASSWORD ghcr.io/kiwix/borg-backup restore --name openzim-wiki --extract "openzim-wiki__backup__2023-10-09T10:06:43"

The wiki backup is extracted to $PWD/data/restore in this example. It contains:

  • a dump of the MySQL database
  • a backup of the "raw" MediaWiki files

If needed (e.g. on a Linux box), ensure that you own all restored files:

sudo chown -R $(id -u -n):$(id -g -n) $PWD/data

The MySQL dump has no file extension; move it to a more practical location:

mv $PWD/data/restore/root/.borgmatic/mysql_databases/openzim-wiki-service/my_wiki $PWD/data/restore/my_wiki.sql
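
A quick sanity check on the renamed file can catch an empty or truncated download before going further. The sketch below assumes the dump is a plain-text SQL dump containing CREATE TABLE statements at the start of a line (as mysqldump normally produces); the helper name check_dump is hypothetical.

```shell
# Hypothetical helper: fail if the dump is empty or contains no table
# definitions, otherwise report how many CREATE TABLE statements it has.
check_dump() {
  dump="$1"
  if [ ! -s "$dump" ]; then
    echo "dump is missing or empty: $dump" >&2
    return 1
  fi
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  tables=$(grep -c '^CREATE TABLE' "$dump" || true)
  if [ "$tables" -eq 0 ]; then
    echo "no CREATE TABLE statement found in: $dump" >&2
    return 1
  fi
  echo "$dump: $tables CREATE TABLE statements"
}

# Usage:
#   check_dump "$PWD/data/restore/my_wiki.sql"
```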

Test the dump file

# Start the image with a MySQL server inside.
# Note that this directly imports the database dump.
docker run -v $PWD/data/restore:/data -it --name my-tester --rm -p 3306:3306 -e DATABASE_TYPE=mysql -e MYSQL_INIT=1 -e MYSQL_IMPORT_FILE=/data/my_wiki.sql ghcr.io/offspot/mediawiki:1.36.1 /bin/bash

Check the DB structure and data using the mysql client from inside the image.

Note: for convenience, we use the MediaWiki image to ensure that we use an appropriate MySQL version. Do not forget to use the same Docker image tag as the one used in production.
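
As an illustration of what such a check could look like, the sketch below prints a few read-only queries to pipe into the mysql client. It assumes the database is named my_wiki (inferred from the dump file name) and that the standard MediaWiki tables page and revision are used without a table prefix; the helper name wiki_check_sql is hypothetical, and the mysql client may need user or password options depending on how the image configures the server.

```shell
# Hypothetical helper: print read-only sanity queries for database $1.
wiki_check_sql() {
  db="$1"
  printf 'USE %s;\n' "$db"
  printf 'SHOW TABLES;\n'
  printf 'SELECT COUNT(*) AS pages FROM page;\n'
  printf 'SELECT MAX(rev_timestamp) AS last_edit FROM revision;\n'
}

# Usage, from the shell opened inside the container:
#   wiki_check_sql my_wiki | mysql
```

The last query gives the timestamp of the most recent edit, which should be close to the date of the selected archive.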

Production Restore Procedure

If the wiki is already running:

  • Shut down the wiki deployment by scaling it down to 0
  • Clean up the persistent volume (delete its content directly from the node)
  • Download the dump file to the volume with the borg-accessor Job (see Downloading dump into volume below)
  • Restore the dump file with the wiki-restore Job (see Restore dump below)
  • Start the wiki by scaling it back to 1
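
The first and last steps of this list can be done with kubectl scale. The sketch below assumes the Deployment lives in the wiki namespace, like the Jobs on this page; the deployment name passed as argument is a placeholder to replace with the real one, and the helper name scale_wiki is hypothetical.

```shell
# Hypothetical helper: scale a Deployment of the "wiki" namespace.
#   $1 = deployment name (placeholder, check the real name in the cluster)
#   $2 = number of replicas (0 to stop the wiki, 1 to start it again)
# KUBECTL can be overridden, e.g. to point at a wrapper or a stub.
scale_wiki() {
  "${KUBECTL:-kubectl}" -n wiki scale "deployment/$1" --replicas="$2"
}

# Usage:
#   scale_wiki <deployment-name> 0   # before cleaning up the volume
#   scale_wiki <deployment-name> 1   # once the restore Job has completed
```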

Downloading dump into volume

To get the dump file into the volume, borg-backup has to be launched inside the cluster. This is done with a temporary k8s Job.

---
apiVersion: batch/v1
kind: Job
metadata:
  name: borg-accessor
  namespace: wiki
spec:
  backoffLimit: 1
  template:
    metadata:
      labels:
        app: borg-app
    spec:
      containers:
      - name: borg-backup
        image: ghcr.io/kiwix/borg-backup
        command: ["restore", "--name", "openzim-wiki", "--extract", "openzim-wiki__backup__2023-10-09T10:06:43"]
        imagePullPolicy: Always
        env:
        - name: BW_CLIENTID
          value: "xxxx"
        - name: BW_CLIENTSECRET
          value: "xxxx"
        - name: BW_PASSWORD
          value: "xxxx"
        volumeMounts:
        - name: data-volume
          mountPath: "/restore"
          readOnly: false
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: openzim-wiki-pvc
      restartPolicy: Never
      nodeSelector:
        k8s.kiwix.org/role: "services"
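
A temporary Job like this one is typically applied, waited for, inspected and then deleted. The sketch below chains those kubectl calls; the manifest file name is whatever you saved the YAML as, the 30m default timeout is an arbitrary assumption, and the helper name run_job is hypothetical.

```shell
# Hypothetical helper: apply a Job manifest, wait for completion, print
# its logs, then delete it. The chain stops at the first failing step,
# so a Job that did not complete is left in place for inspection.
#   $1 = manifest file, $2 = Job name as set in metadata.name
run_job() {
  manifest="$1"
  job="$2"
  k="${KUBECTL:-kubectl}"
  "$k" apply -f "$manifest" &&
    "$k" -n wiki wait --for=condition=complete "job/$job" --timeout="${JOB_TIMEOUT:-30m}" &&
    "$k" -n wiki logs "job/$job" &&
    "$k" -n wiki delete job "$job"
}

# Usage:
#   run_job borg-accessor.yaml borg-accessor
```

Note that kubectl wait only watches for the complete condition here; if the Job fails, the call returns once the timeout expires.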

Restore dump

To load the data from the dump into the MySQL database and place the files at the appropriate location, the MediaWiki startup script has to be launched. This is done with a temporary k8s Job.

---
apiVersion: batch/v1
kind: Job
metadata:
  name: wiki-restore
  namespace: wiki
spec:
  backoffLimit: 1
  template:
    metadata:
      labels:
        app: wiki-restore
    spec:
      containers:
      - name: wiki-restore
        image: ghcr.io/offspot/mediawiki:1.36.1
        command: ["/bin/bash","-c", "mv /var/www/data/root/.borgmatic/mysql_databases/openzim-wiki-service/my_wiki /var/www/data/import.sql && start.sh true && rm /var/www/data/import.sql && rm -rf /var/www/data/config && rm -rf /var/www/data/download && rm -rf /var/www/data/images && rm -rf /var/www/data/site_root && mv /var/www/data/storage/* /var/www/data/ && rmdir /var/www/data/storage && rm -rf /var/www/data/root"]
        imagePullPolicy: Always
        env:
        - name: DATABASE_TYPE
          value: "mysql"
        - name: MYSQL_INIT
          value: "1"
        - name: MYSQL_REMOTE_ACCESS
          value: "1"
        volumeMounts:
        - name: data-volume
          mountPath: "/var/www/data"
          readOnly: false
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: openzim-wiki-pvc
      restartPolicy: Never
      nodeSelector:
        k8s.kiwix.org/role: "services"
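
The one-line command of this Job is dense, so here is the same sequence spelled out step by step as a shell function. This is only a reading aid under stated assumptions: the comments describe what each step appears to do, the behaviour of start.sh is inferred from the way it is called, and the names restore_layout and START_SCRIPT are hypothetical.

```shell
# Hypothetical helper mirroring the Job command, one step per line.
#   $1 = data directory (/var/www/data in the Job)
restore_layout() {
  data="$1"
  # 1. Put the extracted MySQL dump where the import is expected to find it.
  mv "$data/root/.borgmatic/mysql_databases/openzim-wiki-service/my_wiki" "$data/import.sql" &&
    # 2. Run the image start script with "true" as command, so that it
    #    returns once initialisation is done (assumed to import the dump).
    "${START_SCRIPT:-start.sh}" true &&
    # 3. Remove the dump once start.sh has returned successfully.
    rm "$data/import.sql" &&
    # 4. Remove these directories so the backed-up copies can replace them.
    rm -rf "$data/config" "$data/download" "$data/images" "$data/site_root" &&
    # 5. Move the backed-up files from storage/ to the top of the volume.
    mv "$data"/storage/* "$data"/ &&
    rmdir "$data/storage" &&
    # 6. Drop what is left of the extracted archive tree.
    rm -rf "$data/root"
}

# Usage inside the container:
#   restore_layout /var/www/data
```

As in the original command, hidden files directly under storage/ are not moved by the glob, in which case rmdir fails and the remaining steps are skipped.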