Nicolas Le Manchet

PostgreSQL autovacuum not running

In PostgreSQL whenever a row is updated a new tuple is inserted and the old one is marked as dead. Likewise when a row is deleted the underlying tuple is only marked as dead. VACUUM removes dead tuples allowing to reclaim space and speed up queries.

This operation can be done manually using the command VACUUM (FULL) but it's much better to let autovacuum clean up dead rows automatically when needed.

Autovacuum is enabled by default on PostgreSQL but the default configuration is conservative as it waits for a table to have 20% of dead tuples before triggering a vacuum on it.

Checking the last manual and auto vacuum on the tables of a database:

SELECT relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables;

          relname           |          last_vacuum          |        last_autovacuum
----------------------------+-------------------------------+-------------------------------
 django_site                | 2018-12-05 08:15:19.946175+00 |
 django_content_type        | 2018-12-05 08:15:19.926017+00 |
 reader_attachment          | 2018-12-05 08:15:21.058618+00 | 2019-01-31 14:56:29.451346+00

django_site and django_content_type are small read only tables, so they must contain very few dead rows which explains why they have never been vacuumed automatically.

reader_attachment however sees many updates and at the time of writing the last auto vacuum was run about six months ago.

Current autovacuum settings can be checked with:

SELECT name, setting from pg_settings where category like 'Autovacuum';

                name                 |  setting
-------------------------------------+-----------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_vacuum_scale_factor      | 0.2
 autovacuum_vacuum_threshold         | 50

Here we can make sure that autovacuum is actually turned on autovacuum = on. The next interesting settings is autovacuum_vacuum_scale_factor which is a percentage of dead rows in a table to reach before a vacuum is triggered.

Checking the percentage of dead rows per table:

SELECT relname as table, n_live_tup as live, n_dead_tup as dead, ROUND(CAST((n_dead_tup * 100.0 / (n_live_tup+1)) AS FLOAT)) AS percent_dead FROM pg_stat_user_tables;

           table            |  live   | dead  | percent_dead
----------------------------+---------+-------+--------------
 django_site                |       1 |     0 |            0
 django_content_type        |      18 |     0 |            0
 reader_attachment          |  575984 | 65414 |           11

Here lowering autovacuum_vacuum_scale_factor = 0.05 would trigger autovacuum on reader_attachment since it contains more than 5% of dead tuples.

More information: