Online / 5 & 6 February 2022


Performance oriented InnoDB log format changes

How InnoDB crash recovery works

The persistent circular log buffer (the file ib_logfile0) is the foundation of the persistent InnoDB buffer pool.

Over the years, the log file format has been changed in MariaDB Server to improve performance. A well-designed file format imposes minimal write amplification and is easy to parse.

The 512-byte block size of the InnoDB log was a perfect match for the industry standard that was defined by the venerable Seagate ST-225. Alas, the industry moved on, and now block sizes range from 64 bytes (the size of a memory cache line) to 4096 bytes. Therefore, a format that works efficiently with any block size is needed.

We present a flexible format where each mini-transaction (comprising log records) is a block on its own, with a checksum that is calculated in a local buffer, reducing contention on the mutex that protects the global log buffer. The old 12-byte block header is shrunk to a 1-bit sequence number, for detecting the end of the circular log. The additional overhead is 4 bytes per mini-transaction for a CRC-32C checksum.
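The framing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual on-disk encoding: the function names `frame_mtr` and `is_valid_mtr`, the use of a whole trailing byte to carry the 1-bit sequence number, and the big-endian checksum are all hypothetical choices made for the example. The point it shows is that the checksum is computed over a private buffer before anything touches shared state, and that a reader can recognize the end of the circular log when either the sequence bit or the CRC-32C stops matching.

```python
import struct

CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial used by CRC-32C


def _make_table():
    """Build the 256-entry lookup table for a byte-wise CRC-32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Plain table-driven CRC-32C (no hardware acceleration)."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def frame_mtr(records: bytes, seq_bit: int) -> bytes:
    """Frame one mini-transaction in a local buffer.

    Hypothetical layout: records, one byte carrying the sequence bit,
    then a 4-byte CRC-32C over everything before it.
    """
    body = records + bytes([seq_bit & 1])
    return body + struct.pack(">I", crc32c(body))


def is_valid_mtr(frame: bytes, expected_seq_bit: int) -> bool:
    """Return False at the end of the log: stale sequence bit or bad checksum."""
    if len(frame) < 5:
        return False
    body = frame[:-4]
    (stored,) = struct.unpack(">I", frame[-4:])
    return (body[-1] & 1) == expected_seq_bit and crc32c(body) == stored
```

In this sketch the sequence bit would flip each time the writer wraps around the circular file, so a frame left over from the previous pass fails `is_valid_mtr` even though its checksum is intact.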

No-op records may be written to pad the log buffer to match the physical block size. Encryption will only cover data. The length of each record is explicitly stored in the clear, which allows consistent hot backups without having any encryption keys. For an encrypted log, an 8-byte part of the initialization vector will be written after each mini-transaction.
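A toy model may help show why a cleartext length is enough for a backup tool. The encoding below is an assumption made purely for illustration (a single length byte per record, with a zero byte standing for a no-op padding record); the names `encode_record`, `pad_to_block`, and `record_spans` are hypothetical. Payloads are treated as opaque bytes, as they would be to a reader that holds no encryption key.

```python
NOOP = 0x00  # assumption for this sketch: a zero length byte is a no-op pad


def encode_record(payload: bytes) -> bytes:
    """Prefix an opaque (possibly encrypted) payload with its length in the clear."""
    if not 1 <= len(payload) <= 255:
        raise ValueError("this toy encoding supports 1..255 byte payloads")
    return bytes([len(payload)]) + payload


def pad_to_block(buf: bytes, block_size: int) -> bytes:
    """Append no-op bytes until the buffer is a multiple of the block size."""
    return buf + bytes([NOOP]) * (-len(buf) % block_size)


def record_spans(log: bytes):
    """Walk the log using only the cleartext lengths.

    Returns (offset, length) pairs for each payload without ever
    interpreting the payload bytes themselves.
    """
    spans = []
    pos = 0
    while pos < len(log):
        length = log[pos]
        if length == NOOP:
            pos += 1  # skip one byte of padding
            continue
        if pos + 1 + length > len(log):
            raise ValueError("truncated record")
        spans.append((pos + 1, length))
        pos += 1 + length
    return spans
```

For example, `record_spans(pad_to_block(encode_record(b"abc"), 64))` finds the single payload at offset 1 and skips the padding, regardless of what the payload bytes contain.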


Marko Mäkelä