Database logging

IntroductionUp

We log changes in the database in two ways: via write-ahead logging (WAL), a feature of PostgreSQL, and via our own triggers within the database.

These methods produce 2 parallel logs of, in part, the same changes. WAL tracks changes in more tables than trigger logging does. WAL logs all non-temporary tables that are not created with the unlogged attribute. Trigger logging logs only those tables we have decided to log, and we do not log derivative tables or others outside the public schema.

We have not tried to create a logging system to combine these two systems. WAL does not capture user IDs when changes are created via PanLem, while trigger logging does. In addition, trigger logging inserts records into a database table that can be immediately used, while the use and interpretation of WAL records is much more complex.

Write-ahead logging

Trigger logging

Each trigger-logged table has a trigger that causes any change in any record of the table to be logged. The change is recorded in the hist.log table.

We began trigger logging in July 02014. Our system of log preservation keeps only the latest 3 days of records in the hist.log table. Once a week, records older than 3 days old are exported to a text file and deleted from the table. As of September 02016, the entire collection of weekly log-export files occupied about 46GB of storage, not compressed. The text files are retained on the server and archived with duplicity both locally on the server’s FTP storage volume and remotely in an Amazon AWS S3 bucket.

It is possible to inspect and analyze the log records via the export files, or by copying those files back into a database table. The util.logsget(text) function creates a util.logs table and inserts into it records from any specified log-export file. It is most practical to concatenate all the desired export files into a single file and then specify that one as an argument to util.logsget(text). As of September 02016, doing this with the entire collection of 26 months of export files took 80 minutes and produced a table that, with its indexes, occupied 110GB of storage, almost doubling the size of the plx database. The util.logs table is itself not trigger-logged and also not WALS-logged, because it is created with the unlogged attribute. If we migrate the server, it is therefore not restored to the new server’s plx database. It may be dropped when no longer in use and recreated when needed, or left in existence for reuse. In the latter case, any export files that have been created since it was populated can be inserted into it with the statement copy util.logs from 'filepath' (format csv, delimiter '|', where filepath is replaced with each file’s pathname. If that is not enough and you want the content of the hist.log table, too, you can execute insert into util.logs select * from hist.log.