Nicolas Le Manchet

What I've Learned Today: an Engineering Logbook

Oct 12 2018 - Django database connections in background workers

Executing a task in a background worker is similar to handling a request in a WSGI server: at the beginning, the task or request must fetch a database connection, and at the end it must clean it up.

Django uses signals sent at the beginning and end of a request to manage database connections. This means that task queues can do the same and be confident that background tasks get the same isolation as HTTP requests.
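Under the hood this is just Django connecting close_old_connections to its request signals, roughly equivalent to:

from django.core.signals import request_started, request_finished
from django.db import close_old_connections

# Recycle connections that outlived CONN_MAX_AGE around each request
request_started.connect(close_old_connections)
request_finished.connect(close_old_connections)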

In Spinach this translates to:

from django.db import reset_queries, close_old_connections
from spinach import signals

@signals.job_started.connect
def job_started(*args, job=None, **kwargs):
    # Do what Django does when a request starts: clear the query
    # log and discard connections that outlived CONN_MAX_AGE
    reset_queries()
    close_old_connections()

@signals.job_finished.connect
def job_finished(*args, job=None, **kwargs):
    # Same cleanup as at the end of a request
    close_old_connections()

Oct 10 2018 - Testing reusable Django apps

I added a Django integration to Spinach, my Redis task queue for Python. It comes in the form of a small Django application. Naturally I wanted to write automated tests to make sure I don't break it by mistake in the future.

Since the code lives in the main Spinach repository, I didn't want to set up a full Django project just for the sake of testing. It turns out this is not needed: Django can be bootstrapped right from the test suite:

import django
import django.conf

def test_django_app():
    django.conf.settings.configure(
        INSTALLED_APPS=('spinach.contrib.spinachd',),
    )
    django.setup()

    # Start testing here

Oct 2 2018 - Disallowing access to Django admin for non-staff users

Websites built with Django often have their own login page, making the admin login page unnecessary. Django does not provide a setting to disable the admin login page while keeping the rest of the dashboard.

A simple middleware can prevent anonymous and regular users from seeing the admin dashboard:

from django.http import Http404
from django.urls import reverse


def hide_django_admin_middleware(get_response):

    def middleware(request):
        if request.path.startswith(reverse('admin:index')):
            if not request.user.is_staff:
                raise Http404()

        return get_response(request)

    return middleware
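For the middleware to take effect it must be registered in the settings, with the dotted path depending on where the function lives (myproject.middleware here is a hypothetical location). It has to come after Django's AuthenticationMiddleware so that request.user is populated:

MIDDLEWARE = [
    # ... Django's usual middleware, including AuthenticationMiddleware ...
    'myproject.middleware.hide_django_admin_middleware',
]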

Sep 27 2018 - Optimizing Caddy logs for Goaccess

I recently discovered Goaccess, a simple tool that generates analytics reports from web server logs.

The more information the log file contains, the better the report. The default log format of Caddy does not record the virtual host, user agent, referer, or latency, so they need to be added explicitly:

log / /var/log/caddy/access.log "{hostonly} {combined} {latency_ms}"

Goaccess can then use this extra information with:

goaccess access.log -o goaccess.html --log-format='%v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" %L' --time-format='%H:%M:%S' --date-format='%d/%b/%Y'

Mar 13 2018 - Postgres on ZFS

I always thought that Copy-On-Write filesystems like ZFS or Btrfs were a poor fit for databases because of the amount of in-place updates a database performs.

This article from 2ndQuadrant actually shows that ZFS can work well with Postgres. The key seems to be keeping the default ZFS record size of 128k (unlike the usually recommended 8k) and using a fast disk for the ZFS SLOG and cache.

The big advantage of running a database on a COW filesystem is being able to easily take an atomic snapshot of the database at any moment.
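For instance, assuming the database lives on a hypothetical tank/pgdata dataset, a consistent point-in-time copy is a single command:

zfs snapshot tank/pgdata@before-upgrade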

Mar 11 2018 - Docker userland proxy

I was trying to make my task queue based on Redis more resilient to network errors. Both my application and Redis use TCP keepalive to detect disconnects on long lived connections.
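For reference, this is roughly what enabling keepalive looks like from Python (the host is made up, and the aggressive timers are just for illustration; TCP_KEEPIDLE and friends are Linux-specific):

import socket

sock = socket.create_connection(('redis.example.com', 6379))
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# Probe after 5s of idle, every 5s, give up after 3 missed probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 5)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 5)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)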

To test this, I simulated network partitions by dropping traffic to and from the container Redis runs in. This is easily done by entering the Redis process's network namespace and adding an iptables rule:

nsenter -t <Redis PID> --net iptables -A INPUT -j DROP

This worked in the sense that my application and Redis could no longer talk to each other, but neither side detected the disconnect when no traffic was flowing, even though TCP keepalive should have caught it. The reason is related to how Docker publishes ports to the host.

When publishing a port from a Docker container using the default bridge network, the containerized process does not actually bind on the host. The bind is done by a process called the Docker userland proxy, which runs on the host as root.

When an application connects to the containerized Redis, it in fact connects to this proxy, which forwards packets to Redis. My application did not detect the disconnect because the TCP keepalive probes between it and the userland proxy were still succeeding; only the traffic between the proxy and Redis was dropped.
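For this kind of testing, one option (assuming a reasonably recent Docker daemon) is to disable the userland proxy in /etc/docker/daemon.json and restart the daemon, which makes Docker publish ports with iptables rules instead, so connections reach the container directly:

{
    "userland-proxy": false
}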

Mar 8 2018 - Glibc getpid syscall caching

Glibc used to cache the result of the getpid syscall. Because the cache could go stale with some process-creation patterns (direct calls to clone, for instance), it was removed from newer versions of glibc. As a result the call got more expensive, and the difference is visible even from Python.

Debian Jessie shipped the old glibc while Stretch has the new one without cache. Alpine uses musl libc, which apparently does not cache the syscall:

$ docker run -ti --rm python:3.6-jessie python -m timeit -s "import os" "os.getpid()"
10000000 loops, best of 3: 0.0523 usec per loop

$ docker run -ti --rm python:3.6-stretch python -m timeit -s "import os" "os.getpid()"
10000000 loops, best of 3: 0.193 usec per loop

$ docker run -ti --rm python:3.6-alpine python -m timeit -s "import os" "os.getpid()"
1000000 loops, best of 3: 0.203 usec per loop

Mar 7 2018 - Oven clocks getting slowed down in Europe

I noticed that the clock of my kitchen oven was running late: it lost about three minutes in just two weeks. This surprised me since the clock had been very accurate for two years.

I got the answer from a Hacker News thread: oven clocks often don't have a quartz crystal but instead count how many times the mains voltage crosses zero, assuming that at 50 Hz it reaches zero 100 times per second.

This works well when the frequency is stable, but currently Europe suffers from a deficit in energy production that slows the grid down, hence the clocks falling behind.
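The magnitude checks out with quick arithmetic; a sketch estimating the average grid frequency implied by my oven's drift:

# Average mains frequency implied by losing 3 minutes in 2 weeks
lost = 3 * 60                 # seconds lost by the clock
period = 14 * 24 * 3600       # seconds in two weeks
drift = lost / period         # fraction of cycles missing, ~1.5e-4
print(50 * (1 - drift))       # ~49.993 Hz on average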

Feb 15 2018 - Docker and writing to stdout

In a Docker container, a process writing to stdout can get blocked. The Docker daemon is responsible for consuming the stream, and if it cannot consume the data fast enough (during a daemon restart, for instance), calls to print or even logger.info can pause execution until the buffer is drained. By default the buffer size is 64 KB.
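That 64 KB is the default capacity of a Linux pipe, which is presumably what backs the container's stdout. A small experiment filling a pipe nobody reads shows where writes would start blocking:

import fcntl
import os

read_end, write_end = os.pipe()
# Make writes non-blocking so a full buffer raises instead of hanging
flags = fcntl.fcntl(write_end, fcntl.F_GETFL)
fcntl.fcntl(write_end, fcntl.F_SETFL, flags | os.O_NONBLOCK)

written = 0
try:
    while True:
        written += os.write(write_end, b'x' * 1024)
except BlockingIOError:
    print(f'pipe full after {written} bytes')  # 65536 on a default Linux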

Feb 6 2018 - Cleaning untrusted HTML markup

RSS and Atom feeds embed HTML markup that readers are expected to display. For web based readers, including this HTML as-is is very dangerous since a malicious feed could easily embed JavaScript code.

Mozilla made bleach, a nice library that has a strict whitelist of HTML tags and attributes that can safely be used within a web page.
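A minimal sketch of how it can be used on feed content (the tag whitelist here is arbitrary):

import bleach

dirty = '<p>Hello <script>alert("evil")</script></p>'
# Tags outside the whitelist are escaped rather than executed
print(bleach.clean(dirty, tags=['a', 'em', 'p', 'strong']))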

Feb 4 2018 - Safely loading XML files in Python

XML is a tricky format full of unexpected behavior. The only safe way to load untrusted XML files in Python is to do it via defusedxml.
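It is mostly a drop-in replacement for the standard library parsers; a sketch:

from defusedxml.ElementTree import fromstring

# Parses like xml.etree, but raises on dangerous constructs
# such as the billion laughs entity expansion
root = fromstring('<feed><title>Hello</title></feed>')
print(root.find('title').text)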

This is what I use in Atoma to safely load RSS and Atom feeds.

Feb 1 2018 - Gracefully stop Python Docker container

Some Python libraries like Waitress or my own Spinach expect to be stopped by raising a KeyboardInterrupt. This exception is raised automatically by Python when the interpreter receives a SIGINT.

On the other hand, Docker by default sends a SIGTERM to a container and waits 10 seconds for it to stop before killing it with a SIGKILL. To gracefully stop a Python process running in Docker, KeyboardInterrupt can be raised when receiving a SIGTERM:

import signal

def handle_sigterm(*args):
    # Pretend the user hit Ctrl+C so libraries shut down gracefully
    raise KeyboardInterrupt()

signal.signal(signal.SIGTERM, handle_sigterm)
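Alternatively, Docker can be told to send SIGINT in the first place, for instance with a STOPSIGNAL instruction in the Dockerfile, in which case no handler is needed:

# Dockerfile
STOPSIGNAL SIGINT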

Jan 25 2018 - New Python 3 email API

In the latest releases of Python the email package was redesigned. It is now more user-friendly, making the complicated topic of email encodings easier to approach.

For instance the old API required internationalized headers to be decoded manually:

>>> import email
>>> data = b'Subject: Asperges pour le =?utf-8?q?d=C3=A9jeuner?=\n\n'
>>> m = email.message_from_bytes(data)
>>> m['Subject']
'Asperges pour le =?utf-8?q?d=C3=A9jeuner?='

The new API takes care of this automatically:

>>> import email
>>> from email import policy
>>> data = b'Subject: Asperges pour le =?utf-8?q?d=C3=A9jeuner?=\n\n'
>>> # For backward compatibility reasons, the policy needs
>>> # to be passed explicitly to use the new API
>>> m = email.message_from_bytes(data, policy=policy.default)
>>> m['Subject']
'Asperges pour le déjeuner'

Jan 24 2018 - CPU microcode update

The microcode running on the CPU can be updated either by flashing the BIOS or by injecting the new microcode very early during boot.

The second approach is easy on Linux: the microcode gets copied into the initramfs, which the kernel reads very early during boot. Because this happens so early, it is not too late to update the CPU microcode.

The Debian wiki explains how to do that; on an Intel machine it boils down to:

apt install intel-microcode

After a reboot the update can be confirmed in kernel logs:

CPU1 microcode updated early to revision 0x20, date = 2017-01-27
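A quick way to look for that line (assuming the boot messages are still in the kernel ring buffer):

dmesg | grep microcode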