Publish a Zola blog with GitLab CI, real fast

2024-11-02

The two or three regular visitors of this blog (hi friends!) might have noticed I’ve changed the design, as of this year. I’ve actually moved from using the Pelican static blog generator to using Zola. This article shows the continuous integration I’ve set up to automatically build and push content to my server, Real Fast™.

Was there anything wrong with Pelican? Not really. It has served me well over the years, and its name, an anagram of calepin, French for a “small notebook”, is still just genius. Installing it on a new machine was a bit of a pain, as I needed to recall which Python / virtualenv / etc. commands were required to install dependencies, but over the years I’ve made a Makefile to simplify that. And it didn’t happen that often anyway; at most I’d run such a command once every two years or so (which is, arguably, already a lot if that’s the frequency at which one switches machines).

I mostly moved to Zola because I’m a zealot / member of the Rust evangelism strike force in my spare time, and want to support tools created with Rust, which are usually blazing 😎 fast 🚀 ^[1], in addition to being super safe. Also, having a statically linked binary is super convenient, notably for continuous integration and deployment purposes: it’s trivial to cache a single binary in CI, without having to worry about installing/caching dependencies and so on. In addition to that, Zola supports front matter annotations in TOML or YAML, so it offered a nice migration path from Pelican, which uses YAML for front matter.

I’ve been using GitLab, notably the Framagit instance hosted by the good fellows at Framasoft (give them money to support them!), for the repository hosting the sources of my blog. As such, I wanted to be able to push to the repository, and have the CI build and publish to my website.

Now, let me explain how I did it. These instructions are valid for GitLab Community Edition v17.5.1; some things may change in newer versions, so I can’t guarantee they’ll work forever. A first stage builds the public website using a cached Zola binary if possible, or grabs it from the GitHub releases page otherwise. The second stage^[2] uploads the result to the server, using rsync. I’d recommend creating a new user just for this task, with SSH access limited to a single directory: the one where the generated HTML will live. With caching, each stage takes at most 10 seconds to run, which I find… acceptable 😁.
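
For the dedicated user, one way to limit SSH access to a single directory is a forced command in that user’s ~/.ssh/authorized_keys on the server. The sketch below only prints such a line. It assumes the rrsync helper script that ships with rsync is installed at /usr/bin/rrsync (the path varies between distributions) and that the server runs a reasonably recent OpenSSH (for the `restrict` option). The function name, key and directory are placeholders, not part of my actual setup.

```shell
#!/bin/sh
# Hypothetical helper: print an authorized_keys entry that only allows
# write-only rsync transfers into a single directory.
# Assumptions: rrsync lives at /usr/bin/rrsync on the server, and sshd
# understands the `restrict` key option.
restricted_key_line() {
  target_dir="$1"   # e.g. /var/www/myblog
  public_key="$2"   # content of the CI user's .pub file
  printf 'command="/usr/bin/rrsync -wo %s",restrict %s\n' "$target_dir" "$public_key"
}

# Example with placeholder values; the output would be appended to the
# deploy user's ~/.ssh/authorized_keys on the server.
restricted_key_line /var/www/myblog "ssh-ed25519 AAAA...placeholder ci-deploy"
```

With rrsync in place, the remote path passed to rsync is interpreted relative to the restricted directory, so the target directory used by the CI job would likely need adjusting.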

Here’s the .gitlab-ci.yml file I’ve checked in, heavily commented for your (and my future self’s) best understanding:

default:
  image: debian:stable-slim

variables:
  # The runner will be able to pull your Zola theme when the strategy is
  # set to "recursive".
  GIT_SUBMODULE_STRATEGY: "recursive"

  # If you don't set a version here, your site will be built with the latest
  # version of Zola available in GitHub releases.
  # Use the semver (x.y.z) format to specify a version. For example: "0.17.2" or "0.18.0".
  ZOLA_VERSION:
    description: "The version of Zola used to build the site."
    value: "0.19.1"

build:
  stage: build

  # Cache the Zola binary based on its version, to avoid conflicts between different versions.
  cache:
    key: $ZOLA_VERSION
    paths:
      # $CI_PROJECT_DIR is the current working directory in subsequent steps.
      - "$CI_PROJECT_DIR/zola"

  script:
    - |
      if [ ! -e "$CI_PROJECT_DIR/zola" ]; then
        echo "Downloading Zola…"

        # Install `wget`, plus the CA certificates it needs for HTTPS downloads.
        apt-get update --assume-yes && apt-get install --assume-yes --no-install-recommends wget ca-certificates

        # Quote the variable so that an empty or unset version takes the `else` branch.
        if [ -n "$ZOLA_VERSION" ]; then
          zola_url="https://github.com/getzola/zola/releases/download/v$ZOLA_VERSION/zola-v$ZOLA_VERSION-x86_64-unknown-linux-gnu.tar.gz"
          # `--spider` only checks that the URL exists, without downloading it.
          if ! wget --quiet --spider "$zola_url"; then
            echo "A Zola release with the specified version could not be found.";
            exit 1;
          fi
        else
          # No version set: ask the GitHub API for the latest release asset.
          github_api_url="https://api.github.com/repos/getzola/zola/releases/latest"
          zola_url=$(
            wget --quiet --output-document - "$github_api_url" |
            grep "browser_download_url.*x86_64-unknown-linux-gnu.tar.gz" |
            cut --delimiter : --fields 2,3 |
            tr --delete "\" "
          )
        fi

        wget "$zola_url"
        tar -xzf *.tar.gz
      else
        echo "Reusing cached Zola…"
      fi
    - |
      $CI_PROJECT_DIR/zola build

  # The built artifacts will be put in the `public/` directory, and reused during the next stage.
  artifacts:
    paths:
      - public/
    expire_in: 1 day

deploy:
  stage: deploy
  only:
  - main
  dependencies:
  - build
  script:
  # Install rsync and ssh, if needs be.
  - apt-get update -qq && apt-get install -y -qq rsync
  - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
  - eval $(ssh-agent -s)
  # Set SSH private key, and define the right permissions to not trigger security errors.
  - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
  - mkdir -p ~/.ssh
  - chmod 700 ~/.ssh
  # Set SSH known hosts, and define the right permissions to not trigger security errors.
  - echo "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
  - chmod 644 ~/.ssh/known_hosts
  # Run rsync.
  # Note: -avzh = archive / verbose / compress / human-readable
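  # `public/` (trailing slash, no glob) copies the directory's content, dotfiles
  # included, and lets --delete remove top-level files that no longer exist.
  - rsync -avzh --delete public/ -e "ssh -p $SSH_PORT" "$SSH_USERNAME@$SSH_HOST:$SSH_TARGET_DIR"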
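
To debug the download step outside CI, the version-pinned branch of the build script can be pulled out into a small shell function. This is a minimal sketch that reuses the URL pattern from the job above; the function name zola_release_url is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring the pinned-version branch of the build job:
# print the GitHub release URL for a given Zola version (x.y.z format).
zola_release_url() {
  version="$1"
  if [ -z "$version" ]; then
    echo "usage: zola_release_url x.y.z" >&2
    return 1
  fi
  printf 'https://github.com/getzola/zola/releases/download/v%s/zola-v%s-x86_64-unknown-linux-gnu.tar.gz\n' "$version" "$version"
}

# Print the URL for the version pinned in the CI file.
zola_release_url "0.19.1"
```

Running `wget --quiet --spider "$(zola_release_url 0.19.1)"` locally then performs the same existence check as the CI job.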

After you’ve set this up, you need to define some secrets for your CI: go to Settings, then CI/CD, and add all the following variables under Variables. Make sure to create them as Masked variables, if not Masked and hidden, unless you’d like the SSH private key to leak 🤡.

SSH_HOST: server’s host, e.g. mysupersecretserver.bouvier.cc
SSH_PORT: the port used for connecting with SSH, 22 by default.
SSH_KNOWN_HOSTS: a copy of a simplified ~/.ssh/known_hosts file, stripped down to only contain lines related to the above SSH_HOST. This is to avoid having the CI task confirm “Are you sure you want to trust this host” when connecting to the server for the first time.
SSH_TARGET_DIR: the final directory where the generated HTML should live, e.g. /var/www/myblog.
SSH_USERNAME: the username of the (Unix) user used in the CI task to upload generated artifacts to your Web server…
SSH_PRIVATE_KEY: …and its private key.
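
The SSH-related values can be produced from a local terminal. This is a sketch under assumptions: the function names, the key file name ci_deploy_key and the ed25519 key type are illustrative choices, not something the setup above requires. Using ssh-keyscan is an alternative to copying lines out of an existing ~/.ssh/known_hosts file.

```shell
#!/bin/sh
# Hypothetical helpers to produce values for the CI/CD variables above.

# Generate a passphrase-less key pair dedicated to the CI job.
# The private key file content goes in SSH_PRIVATE_KEY; the .pub file content
# goes in the deploy user's ~/.ssh/authorized_keys on the server.
make_deploy_key() {
  key_file="${1:-./ci_deploy_key}"
  ssh-keygen -q -t ed25519 -N "" -C "gitlab-ci-deploy" -f "$key_file"
}

# Print the server's public host keys in known_hosts format, for SSH_KNOWN_HOSTS.
# This needs network access to the server; compare the result against a
# fingerprint you already trust before pasting it anywhere.
print_known_hosts() {
  host="$1"
  port="${2:-22}"
  ssh-keyscan -p "$port" "$host" 2>/dev/null
}

# Usage (not run here):
#   make_deploy_key ./ci_deploy_key
#   print_known_hosts mysupersecretserver.bouvier.cc 22
```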

Hope this was useful!

[1] Don’t pay attention to the emojis, they’re a meme at this point.
[2] Using two stages is inherited from my previous setup using Pelican, and might be overkill since each stage takes at most 10 seconds to run, with a hot cache.