Posted on November 29, 2020 | Guille

By hosting your static site on GitLab Pages you don’t have to worry about managing a web server, but just as “with great power comes great responsibility”, with less responsibility you must also relinquish some control. If you inspect your site’s response headers you will find that GitLab’s servers don’t compress your assets on their own, which can make them weigh up to 75% more than necessary! The good news is that you can get that compression without having to touch the server settings.

Why compress?

One of the things that affects page load time most is the time required to download all the pieces that make up the site (html, css, js, images, fonts, etc.), especially for users on low-speed Internet connections. Making each file as small as possible can make a huge difference to the total page load time. For images, this is usually done beforehand by saving them in formats with built-in compression, such as JPEG, PNG, and WebP. It’s ridiculous to serve raw bitmaps because all browsers support compressed images and the difference in size can be up to 10x (image compression can be lossy, which reduces the size much more at the cost of a little image quality). For all other files you have to use a lossless compression algorithm so that the decompressed result is identical to the original file (even a one-bit difference could corrupt the file and make the browser fail at rendering the page). Although the difference in size won’t be as large as with images, it is still very noticeable.
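To make the “lossless” requirement concrete, here is a minimal sketch of a gzip round trip on a throwaway file. The file names and the generated CSS are made up for illustration; only the standard gzip, cmp, and wc tools are assumed.

```shell
#!/bin/sh
# Sketch: gzip round trip on a throwaway file (all names are illustrative).
set -eu

workdir=$(mktemp -d)
original="$workdir/sample.css"

# Build a repetitive text file, similar in spirit to real CSS.
i=0
while [ "$i" -lt 500 ]; do
  echo ".item-$i { margin: 0; padding: 0; color: #333333; }" >> "$original"
  i=$((i + 1))
done

# Compress at the highest level, then decompress into a separate file.
gzip -9 -c "$original" > "$original.gz"
gzip -dc "$original.gz" > "$workdir/restored.css"

# tr strips the padding that some wc implementations add.
original_size=$(wc -c < "$original" | tr -d ' ')
compressed_size=$(wc -c < "$original.gz" | tr -d ' ')
echo "original: $original_size bytes, gzip: $compressed_size bytes"

# cmp exits non-zero if even a single byte differs.
if cmp -s "$original" "$workdir/restored.css"; then
  roundtrip="identical"
else
  roundtrip="different"
fi
echo "round trip: $roundtrip"
```

The restored file matches the original byte for byte, while the compressed copy is considerably smaller, which is exactly the property needed for html, css, and js.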

Web compression algorithms

The two main compression algorithms used for non-image files are gzip and Brotli. Gzip is a general-purpose lossless compression algorithm that has been around since 1992, and has been supported by all browsers since 2001. Brotli is a relatively new algorithm (released in 2013) designed specifically for compressing web page assets. While gzip builds its dictionary from the file being compressed, Brotli also ships with a pre-defined dictionary containing the most common strings found in html, js, css, and font files. This gives it an advantage over gzip of around 17% when compressing these types of files. All modern browsers support Brotli, but to support older versions it is a good idea to have both options available, so that each browser can announce what it supports and receive the best version it can handle. Usually the compression is done by the web server software on-the-fly (as it receives the request from the browser), but it can also be done beforehand, just as in the case of images.
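The negotiation works through the Accept-Encoding request header, in which the browser lists the encodings it can decode. The following is a deliberately simplified sketch of the choice a server makes. The pick_encoding function is a hypothetical helper written for this post, not part of any real server, and it ignores q-values and wildcards.

```shell
#!/bin/sh
# Simplified sketch of content negotiation; ignores q-values and wildcards.
# pick_encoding is a hypothetical helper, not part of any real server.
pick_encoding() {
  # $1 is the value of the Accept-Encoding request header.
  # Strip spaces and wrap in commas so each token can be matched whole.
  accepted=",$(printf '%s' "$1" | tr -d ' '),"
  case "$accepted" in
    *,br,*)   echo "br" ;;        # Brotli preferred when offered
    *,gzip,*) echo "gzip" ;;      # otherwise fall back to gzip
    *)        echo "identity" ;;  # no compression at all
  esac
}

pick_encoding "gzip, deflate, br"   # prints br
pick_encoding "gzip, deflate"       # prints gzip
pick_encoding ""                    # prints identity
```

To see what a real server picks, you can send the header yourself, for example with curl -sI -H 'Accept-Encoding: br, gzip' followed by your site’s URL, and look for the Content-Encoding header in the response.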

Compression in GitLab Pages

To support compression in GitLab Pages all you have to do is provide the pre-compressed files. If as well as having an index.html you also have the compressed index.html.gz, GitLab Pages will serve the compressed version to any browser that announces gzip support. By having three versions of each file (one uncompressed, one gzip-compressed with a .gz extension, and one Brotli-compressed with a .br extension), every browser gets the best version it can handle. Now, you could do this manually on your computer before pushing to the repository that is linked to your GitLab Pages, but you can just as well use GitLab’s CI pipeline to do it for you. In fact, if your static assets are being created in the CI pipeline (for example, when using an SSG such as Hugo) you have to add the compression there.
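Since a file without its compressed siblings is simply served uncompressed, it can be useful to check the output directory before publishing. Below is a small sketch of such a check. The missing_siblings function and the demo directory are hypothetical, and the .gz and .br suffixes are assumed because they are what gzip and brotli produce by default.

```shell
#!/bin/sh
# Sketch: report files in a directory that lack a .gz or .br sibling.
# missing_siblings is a hypothetical helper written for illustration.
missing_siblings() {
  # $1 is the directory to scan, e.g. public
  find "$1" -type f \( -iname '*.html' -o -iname '*.css' \
                       -o -iname '*.js' -o -iname '*.xml' \) |
  while IFS= read -r file; do
    [ -f "$file.gz" ] || echo "$file.gz"
    [ -f "$file.br" ] || echo "$file.br"
  done
}

# Demo on a throwaway tree: index.html is fully covered, style.css is not.
demo=$(mktemp -d)
touch "$demo/index.html" "$demo/index.html.gz" "$demo/index.html.br"
touch "$demo/style.css" "$demo/style.css.gz"

missing=$(missing_siblings "$demo")
echo "missing: ${missing:-none}"
```

In the demo only style.css.br is reported. Pointed at a real public directory, an empty result means every html, css, js, and xml file has both compressed versions.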

Compressing files in GitLab CI

To compress your files in the GitLab CI pipeline you just have to install the necessary tools (gzip is included in most Linux images, but Brotli usually is not) and run them. One of the advantages of compressing files beforehand instead of on-the-fly is that there is no need to compromise on compression ratio vs. compression speed. Web servers tend to compress less to minimize CPU usage and response time, but this way the time it takes to compress only affects the CI pipeline execution time. Here is an example .gitlab-ci.yml file for compressing a static page created with Hugo:

image:
  name: klakegg/hugo:alpine
  entrypoint: [""]

before_script:
  # brotli is not part of the Alpine image, so install it first
  - apk update
  - apk add brotli

pages:
  script:
    # Build the site into public/
    - hugo --minify
    # Write .gz and .br copies next to every text asset, keeping the originals
    - gzip -k -9 $(find public -iname '*.html' -o -iname '*.css' -o -iname '*.js' -o -iname '*.xml')
    - brotli -Z $(find public -iname '*.html' -o -iname '*.css' -o -iname '*.js' -o -iname '*.xml')
  artifacts:
    paths:
      - public
  only:
    - master

Since the Alpine Linux docker image doesn’t include brotli it must be installed with apk (use apt for Debian-based images, dnf or yum for Red Hat, etc.). Then after building the static files with the Hugo command, we run the gzip and brotli commands with similar arguments. The -k in gzip tells it to keep the original files (this is the default behavior of brotli). The -9 in gzip and -Z in brotli specify the highest compression level. Instead of giving a list of files to compress, you can use the find command to list all files of a certain type in your public directory. In this case the file types are html, css, js, and xml (for the sitemap and the RSS feed). If you had web fonts you would want to add those too; ttf files benefit the most, since woff and woff2 are already compressed internally and gain little. The same goes for svg files (in my case all svg images are inlined in the html files).
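As a sketch of how the file list could be extended, the script below adds svg and ttf to the find expression and passes the matches through -exec instead of command substitution, so that paths containing spaces are handled correctly. This is a variation on the commands above, not the setup used on this site. The sample tree, the file names, and the compress_with helper are made up for illustration, and brotli is only run if it happens to be installed.

```shell
#!/bin/sh
# Sketch: same idea as the CI script, extended to svg and ttf files.
# The sample tree and compress_with are hypothetical, for illustration only.
set -eu

site=$(mktemp -d)
mkdir -p "$site/public/fonts"
echo '<svg xmlns="http://www.w3.org/2000/svg"></svg>' > "$site/public/logo.svg"
echo 'placeholder font data' > "$site/public/fonts/my font.ttf"
echo 'placeholder image data' > "$site/public/photo.png"

compress_with() {
  # "$@" is the compressor and its flags, e.g. gzip -k -9.
  # The parentheses group the -o tests so -exec applies to every match.
  find "$site/public" -type f \
    \( -iname '*.html' -o -iname '*.css' -o -iname '*.js' -o -iname '*.xml' \
       -o -iname '*.svg' -o -iname '*.ttf' \) \
    -exec "$@" {} +
}

compress_with gzip -k -9

# brotli is often missing, so only run it when it is available.
if command -v brotli >/dev/null 2>&1; then
  compress_with brotli -Z
fi

find "$site/public" -type f | sort
```

The png file is left alone because it is already compressed, while logo.svg and the font each get a .gz copy (and a .br copy when brotli is present) next to the untouched original.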
