Compress prune/freeze records with Delta Frame of Reference algorithm

Started by Evgeny Voropaev3 days ago1 messages
Jump to latest
#1Evgeny Voropaev
evgeny.voropaev@tantorlabs.com

Hi hackers,

A prune/freeze record contains four sequences of integers representing
frozen, redirected, unused, and dead tuples. Currently, these
sequences are uncompressed, which adds overhead. Eliminating empty
leading bits in these offsets can reduce the size of the record, and
applying bit-packing algorithms can reduce it even further. Reducing
WAL record size also lessens the load on disk and network, which are
heavily used for storing WAL and transmitting it to replicas. For
example, with BLCKSZ equal to 8192, the maximum tuple offset is around
580. Despite this, all offsets are stored as 16-bit integers with no
compression applied.

In the proposed patch, using the customized Delta Frame of Reference
(DFoR) algorithm, the unused and dead sequences are compressed. The
frozen and redirected sequences cannot be compressed with DFoR because
the order of their elements is significant, and DFoR does not yet
support unsorted sequences. The theoretical compression ratio for
dfor_u16 can reach up to 16.

The new GUC wal_prune_dfor_compression controls (enables or disables)
compression for prune/freeze records.

An integral TAP test, 052_prune_dfor_compression.pl, has been
implemented. It demonstrates an average compression ratio of at least
5 when analyzing prune/freeze records in practice.

The patch series includes:
1. 0001-Implement-vect-and-uniqsortvect-containers-and-bitpack.patch
- Introduces the vect and uniqsortvect containers and the bitpack
unit used by DFoR.

2. 0002-Implement-Delta-Frame-of-Reference-compression.patch
- Implements DFoR itself.

3. 0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-prune.patch
- Applies DFoR to prune/freeze records.

Unit tests are included and integrated into the PostgreSQL build
system.

Using vect, bitpack, and DFoR for prune/freeze records is just one
example of their application. Independently of prune/freeze, these
algorithms can be effectively used by developers in other areas of
PostgreSQL.

Looking forward to discussion!

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.

Attachments:

v01-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v01-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2824-2
v01-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v01-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1199-2
v01-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v01-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+616-35