MaDada infrastructure description

This document is written in English in the hope that it might help other Alaveteli site owners.

This page describes the servers and infrastructure behind Ma Dada. "Infrastructure" is a bit of a pompous name for it, as it is fairly simple and limited :)

Very short summary

Servers: Gandi VPS, 1 prod, 1 staging
Code: alaveteli (ruby on rails) / postgresql
Deployment: ansible playbook hosted on gitlab
Stats: metabase served from the same VPS
Forum: Discourse hosted by IndieHosters
Documentation: Material for MkDocs

Software platform

Servers

Our servers are all VPSs provided by Gandi. We use 2 servers:

a production machine (the one behind madada.fr) which hosts all the production services:
- the application itself (alaveteli)
- the web server (nginx)
- the mail server (postfix)
- the POP server (dovecot), which passes incoming emails to alaveteli
- the database (postgresql)
- the public stats page (metabase)
a staging server (https://dadastaging.okfn.fr) which duplicates the production server's configuration as much as possible (same software versions, same file structure, etc.). The only differences are secrets, server names and network settings.

Source code

The source code for the Alaveteli software which powers the site is hosted on github.

The code specific to Ma Dada is hosted on gitlab. We have several repositories; the main one, dada-core, contains all the code to configure and deploy the site.

Deployment

We deploy the service using an ansible playbook.

This page won't try to explain ansible in full; the ansible documentation should help if you're not familiar with it. Instead, here are some of our specifics:

The main setup playbook is in a role called alaveteli, under ansible/roles/alaveteli. Reading through ansible/roles/alaveteli/tasks/main.yml should give you a step-by-step view of how we do our deployment.
The role relies on 2 other roles:
- anxs.postgresql to set up the database
- geerlingguy.swap to set up a small swap file to help avoid memory issues during some deployment steps (search for that name in the playbook to see more details)

Our ansible setup is designed to run from our gitlab CI pipeline, so that nobody needs to touch the servers directly or deploy from their own machine.

The code in this repo (under the ansible folder) should hold all the info needed to rebuild the server from scratch.

We aim to change configuration only through the playbooks rather than on the servers themselves. The playbooks then act as documentation of our work and help with traceability and reproducibility.

Secrets management

Secrets are stored in 2 different ways:
- ansible vault files for most of them, which are under ansible/group_vars/*/vault.yml
- gitlab CI environment variables for the ones that are needed before decrypting the vault files (e.g. the vault password)

All secret variables in the playbook are prefixed with vault_ to help identify them. To make things easier to follow, we also copy each one to a non-prefixed variable with the same name (e.g. vault_somepassword -> somepassword) in the main.yml config file where all variables are set. This has two consequences:

there are no vault_* variables floating around the code,
all variables are set in one place, making it easier to find the relevant ones.
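
To illustrate the naming convention, here is a minimal Python sketch. It is not part of the playbook: the function names and the sample variable names are hypothetical, and it assumes the secrets are available as a simple dictionary. The first function mirrors the vault_somepassword -> somepassword copy, and the second shows how one might check that no vault_* name leaks into other files.

```python
import re

# Sketch of the "vault_ prefix" convention; all names here are hypothetical.
VAULT_PREFIX = "vault_"


def unprefixed_aliases(vault_vars: dict[str, str]) -> dict[str, str]:
    """Map each vault_<name> secret to a plain <name> variable."""
    aliases = {}
    for key, value in vault_vars.items():
        if not key.startswith(VAULT_PREFIX):
            raise ValueError(f"secret {key!r} is missing the {VAULT_PREFIX!r} prefix")
        # Strip the prefix so the rest of the code only sees the plain name.
        aliases[key[len(VAULT_PREFIX):]] = value
    return aliases


def stray_vault_references(text: str) -> list[str]:
    """Return the vault_* names mentioned in a piece of text, sorted and de-duplicated."""
    return sorted(set(re.findall(r"\bvault_\w+", text)))
```

Running the second function over every file except the central variables file would be one way to confirm that the prefixed names stay in a single place.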

Theme and French translations

Alaveteli includes a theming system that allows each site to localise and customise its appearance and behaviour.

The Ma Dada theme is here.

The French translation is done through several systems which are detailed in this page (in French).

DNS

All DNS configuration is done via Gandi's DNS servers.

Stats and dataviz

We run a public page of statistics (dataviz for the cool kids). This is a self-hosted instance of metabase, which runs on the same VPS as the production site.

Metabase does not have access to the production database. Instead, we run a nightly cron job which dumps the contents of the main database and removes any confidential data (names, embargoed requests, content of messages...) so that only "safe" metadata remains. This dump is then loaded into a separate database for metabase to read.
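
The scrubbing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our actual cron job: the column names (user_name, message_body, embargoed) are hypothetical, and rows are represented as plain dictionaries rather than database records.

```python
from typing import Iterable

# Hypothetical column names, chosen only for illustration.
CONFIDENTIAL_COLUMNS = {"user_name", "message_body"}


def scrub(rows: Iterable[dict]) -> list[dict]:
    """Keep only non-confidential metadata from non-embargoed requests."""
    safe_rows = []
    for row in rows:
        if row.get("embargoed"):
            continue  # embargoed requests are left out entirely
        # Drop confidential columns, keep everything else as metadata.
        safe_rows.append(
            {k: v for k, v in row.items() if k not in CONFIDENTIAL_COLUMNS}
        )
    return safe_rows
```

The point of the design is that the stats tool only ever sees the output of such a filter, so a misconfigured dashboard cannot expose data that was never loaded.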

Backups

Backups are done for two elements:

the postgresql database, using wal-g. It is set up for Point-In-Time Recovery (PITR), meaning the database can be restored to any point in its history, with a resolution of 1 minute.
raw emails, which are saved as files and backed up using restic.

All backups are done off-site, to an S3-compatible object storage hosted by Exoscale in Geneva (our servers are in Paris). This is to minimise the risk of data loss should there be a major incident at the main datacenter (such as a fire...).

All configuration details are in this repository, and most easily found by searching for "walg" and "restic" respectively.

As an overview, both are configured using an envdir, which converts files into environment variables for the target program. The name of the file determines the env var's name, and the file's content is the value of the resulting variable. All of these are defined in the playbook. For instance, search for db_backup_walg_envdir. The actual values are defined in the playbook variables.
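
For readers unfamiliar with envdir, the mechanism described above can be approximated in Python as follows. This is a simplified sketch, not the real tool: the function names are hypothetical, and the special cases that envdir handles (such as empty files or embedded NUL bytes) are not reproduced here.

```python
import os
from pathlib import Path


def load_envdir(directory: str) -> dict[str, str]:
    """Build environment variables from files, roughly as envdir does."""
    env = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text().splitlines()
        # The file name is the variable name. Only the first line is used
        # as the value, with trailing spaces and tabs dropped.
        env[path.name] = lines[0].rstrip(" \t") if lines else ""
    return env


def run_env(directory: str) -> dict[str, str]:
    """Merge the envdir variables on top of the current environment."""
    return {**os.environ, **load_envdir(directory)}
```

The dictionary returned by run_env could be passed as the env argument of subprocess.run to start the backup program with those variables set.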

Services we don't host directly

The forum is running discourse managed by IndieHosters.

The documentation is built using Material for MkDocs and deployed as a static site on gitlab pages.

We have a social media presence on mastodon and twitter and we host our videos on peertube.

Our blog is built with jekyll and hosted on gitlab pages.