Compress prune/freeze records with Delta Frame of Reference algorithm

Started by Evgeny Voropaevabout 2 months ago22 messageshackers
Jump to latest
#1Evgeny Voropaev
evgeny.voropaev@tantorlabs.com

Hi hackers,

A prune/freeze record contains four sequences of integers representing
frozen, redirected, unused, and dead tuples. Currently, these
sequences are uncompressed, which adds overhead. Eliminating empty
leading bits in these offsets can reduce the size of the record, and
applying bit-packing algorithms can reduce it even further. Reducing
WAL record size also lessens the load on disk and network, which are
heavily used for storing WAL and transmitting it to replicas. For
example, with BLCKSZ equal to 8192, the maximum tuple offset is around
580. Despite this, all offsets are stored as 16-bit integers with no
compression applied.

In the proposed patch, using the customized Delta Frame of Reference
(DFoR) algorithm, the unused and dead sequences are compressed. The
frozen and redirected sequences cannot be compressed with DFoR because
the order of their elements is significant, and DFoR does not yet
support unsorted sequences. The theoretical compression ratio for
dfor_u16 can reach up to 16.

The new GUC wal_prune_dfor_compression controls (enables or disables)
compression for prune/freeze records.

An integral TAP test, 052_prune_dfor_compression.pl, has been
implemented. It demonstrates an average compression ratio of at least
5 when analyzing prune/freeze records in practice.

The patch series includes:
1. 0001-Implement-vect-and-uniqsortvect-containers-and-bitpack.patch
- Introduces the vect and uniqsortvect containers and the bitpack
unit used by DFoR.

2. 0002-Implement-Delta-Frame-of-Reference-compression.patch
- Implements DFoR itself.

3. 0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-prune.patch
- Applies DFoR to prune/freeze records.

Unit tests are included and integrated into the PostgreSQL build
system.

Using vect, bitpack, and DFoR for prune/freeze records is just one
example of their application. Independently of prune/freeze, these
algorithms can be effectively used by developers in other areas of
PostgreSQL.

Looking forward to discussion!

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.

Attachments:

v01-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v01-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2824-2
v01-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v01-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1199-2
v01-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v01-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+616-35
#2Andrey Borodin
amborodin@acm.org
In reply to: Evgeny Voropaev (#1)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 7 Mar 2026, at 18:56, Evgeny Voropaev <evgeny.voropaev@tantorlabs.com> wrote:

Looking forward to discussion!

Great idea and nice library for DFoR!

Prune/freeze records can be large, and reducing their size would help streaming
replication a lot. The algorithm itself (Delta Frame of Reference bit-packing) is
sound for sorted integer sequences.

Eventually we will have wholesale WAL compression, but proposed here type of
compression is more perspective for particular case, because it exploits knowledge
about nature of compressed data.

The patchset is, obviously, a prototype that worth working on further.
Perhaps, first thing to start is to fix CI failures [0]https://github.com/x4m/postgres_g/runs/67140025428.

I do not know if we can budle LGPL lib. Luckily it's only for tests, when all
design questions are resolved we can just re-implement tests with something else.

As a minor nit: do not use stdlib assert(), use capital Assert() :)

Are we 100% sure qsort() won't allocate something anywhere? sort_template.h seems
to be allocation-free, but just in case...

Best regards, Andrey Borodin.

[0]: https://github.com/x4m/postgres_g/runs/67140025428

#3Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Andrey Borodin (#2)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Hello Andrey,

Great idea and nice library for DFoR! Thank you for your attention.

I wish the patch would be useful.

All your proposals and recommendations have been implemented in v02.
Also the meson settings has been updated for supporting the new
developments.

Perhaps, first thing to start is to fix CI failures

Once meson is fixed, tests should pass successfully now. Looking
forward to this.

As a minor nit: do not use stdlib assert(), use capital Assert()

Done.

Are we 100% sure qsort() won't allocate something anywhere?
sort_template.h seems to be allocation-free, but just in case...

No guarantees from libraries or from descriptions about qsort's
behaviour regarding dynamic memory allocation. So, the qsort is just
substituted with the sort_template, which we trust, and as you
proposed.

Waiting for tests to have passed, and then I hope we could move further.

Best regards, Evgeny Voropaev.

Attachments:

v02-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v02-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2893-2
v02-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v02-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1201-2
v02-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v02-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+627-35
#4Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#3)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Rebase the patch onto 516310ed4db, added checks that a destintation
pointer passed to memcpy is not equal to NULL in the function
vect_reserve.

Looking forward to CFBOT results.

Attachments:

v03-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v03-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+627-35
v03-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v03-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1201-2
v03-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v03-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2901-2
#5Andres Freund
andres@anarazel.de
In reply to: Evgeny Voropaev (#1)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Hi,

On 2026-03-07 21:56:18 +0800, Evgeny Voropaev wrote:

A prune/freeze record contains four sequences of integers representing
frozen, redirected, unused, and dead tuples. Currently, these
sequences are uncompressed, which adds overhead. Eliminating empty
leading bits in these offsets can reduce the size of the record, and
applying bit-packing algorithms can reduce it even further. Reducing
WAL record size also lessens the load on disk and network, which are
heavily used for storing WAL and transmitting it to replicas. For
example, with BLCKSZ equal to 8192, the maximum tuple offset is around
580. Despite this, all offsets are stored as 16-bit integers with no
compression applied.

I'm unconvinced that this is a serious problem - typically the overhead of WAL
volume due to pruning / freezing is due to the full page images emitted, not
the raw size of the records. Once an FPI is emitted, this doesn't matter.

What gains have you measured in somewhat realistic workloads?

Greetings,

Andres Freund

#6Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Andres Freund (#5)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Hello Andres,

I'm unconvinced that this is a serious problem - typically the overhead of WAL
volume due to pruning / freezing is due to the full page images emitted, not
the raw size of the records. Once an FPI is emitted, this doesn't matter.

What gains have you measured in somewhat realistic workloads?

So far, we have had no tests in any real production environment.
Moreover, the load in the new test
(recovery/t/052_prune_dfor_compression.pl) is quite contrived. However,
it demonstrates a compression ratio of more than 5, and it is measured
for an overall size of all prune/freeze records with no filtering.

Further development is the implementation of compression of unsorted
sequences. This is going to allow PostgreSQL to compress also the
'frozen' and the 'redirected' offset sequences, which should result in a
greater compression ratio.

But I agree with you, Andres, we need practical results to estimate a
profit. I wish we would test it on some real load soon.

Also I hope, independently of its usage in prune/freeze records, the
DFoR itself might be used for compression sequences in other places of PG.

Best regards,
Evgeny Voropaev.

#7Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#6)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Rebase onto 2102ebb1953.
Fix warnings about gnu_printf in tap.c.
Fix wrong destination pointer in memcpy in vect_init.
Fix intl library linking problem for the dfor tests under FreBSD.

Attachments:

v04-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v04-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2902-2
v04-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v04-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1201-2
v04-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v04-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+627-35
#8Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#7)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Rebase onto 10e4d8aaf46. Continue fixing problems in CI jobs.

Best regards,
Evgeny Voropaev.

Attachments:

v05-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v05-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2908-2
v05-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v05-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1202-2
v05-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v05-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+627-35
#9Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#8)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Based upon 10e4d8aaf46 still. Continue fixing problems in CI jobs.

Attachments:

v06-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v06-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2911-2
v06-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v06-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1202-2
v06-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v06-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+627-35
#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Evgeny Voropaev (#6)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 3/24/26 15:28, Evgeny Voropaev wrote:

Hello Andres,

I'm unconvinced that this is a serious problem - typically the
overhead of WAL
volume due to pruning / freezing is due to the full page images
emitted, not
the raw size of the records. Once an FPI is emitted, this doesn't matter.

What gains have you measured in somewhat realistic workloads?

So far, we have had no tests in any real production environment.
Moreover, the load in the new test (recovery/
t/052_prune_dfor_compression.pl) is quite contrived. However, it
demonstrates a compression ratio of more than 5, and it is measured for
an overall size of all prune/freeze records with no filtering.

Further development is the implementation of compression of unsorted
sequences. This is going to allow PostgreSQL to compress also the
'frozen' and the 'redirected' offset sequences, which should result in a
greater compression ratio.

But I agree with you, Andres, we need practical results to estimate a
profit. I wish we would test it on some real load soon.

Also I hope, independently of its usage in prune/freeze records, the
DFoR itself might be used for compression sequences in other places of PG.

IMHO Andres is right. A ~170kB patch really should present some numbers
quantifying the expected benefit. It doesn't need to be a real workload
from production, but something plausible enough. Even some basic
back-of-the-envelope calculations might be enough to show the promise.

Without this, the cost/benefit is so unclear most senior contributors
will probably review something else. You need to make the case why this
is worth it.

I only quickly skimmed the patches, for exactly this reason. I'm a bit
confused why this needs to add the whole libtap thing in 0001, instead
of just testing this through the SQL interface (same as test_aio etc.).

Also, I find it somewhat unlikely we'd import a GPLv3 library like this,
even if it's just a testing framework. Even ignoring the question of
having a different license for some of the code, it'd mean maintenance
burden (maybe libtap is stable/mature, no idea). I don't see why this
would be better than "write a SQL callable test module".

regards

--
Tomas Vondra

#11Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Tomas Vondra (#10)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Tomas, Andreus, Andrey, hello!

A ~170kB patch really should present some numbers
quantifying the expected benefit. It doesn't need to be a real workload
from production, but something plausible enough. Even some basic
back-of-the-envelope calculations might be enough to show the promise.

The patch results in reduction of WAL total size by:
81% during vacuuming a table having no index,
and by 55% during vacuuming a table having an index.

The numbers are the next:

=== VACUUM (table with no index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 6743149 1184446 82%
Prune records size 6710185 1159723 5.8x
-------------------- ----------------- ----------------- -----------

=== VACUUM (table with index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 20394208 8907090 56%
Prune records size 6812850 1225944 5.6x
-------------------- ----------------- ----------------- -----------

The logic of the tests is based on the technique from [1]/messages/by-id/CAAKRu_ZMw6Npd_qm2KM+FwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g@mail.gmail.com and is the
next:

-- SQL
CREATE TABLE t_prune ( id int, val text )
WITH (fillfactor = 100, autovacuum_enabled = false);

INSERT INTO t_prune
SELECT g, 'x' FROM generate_series(1,3000000) g;

CREATE INDEX ON t_prune(id); -- for the test using an indexed table

DELETE FROM t_prune WHERE id % 500 <> 0;
SELECT pg_current_wal_flush_lsn(); -- get start_lsn here
VACUUM FREEZE t_prune; -- 3 times
SELECT pg_current_wal_flush_lsn(); -- get end_lsn here

# BASH
# stop cluster
pg_waldump -p $wal_dir -s $start_lsn -e $end_lsn 2>/dev/null;

The test is implemented in 052_prune_dfor_compression.pl, therefore
the presented results can be refetched by restarting this test script.

Also, I find it somewhat unlikely we'd import a GPLv3 library like
this, even if it's just a testing framework. Even ignoring the
question of having a different license for some of the code, it'd mean
maintenance burden (maybe libtap is stable/mature, no idea). I don't
see why this would be better than "write a SQL callable test module".

I am ready to rework it once there is consensus on the core of the
patch.

Best regards,
Evgeny.

P.s. rebased onto a1643d40b30.

[1]: /messages/by-id/CAAKRu_ZMw6Npd_qm2KM+FwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g@mail.gmail.com
/messages/by-id/CAAKRu_ZMw6Npd_qm2KM+FwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g@mail.gmail.com

Attachments:

v08-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v08-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2911-2
v08-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v08-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1202-2
v08-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v08-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+755-35
#12Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#11)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Hello hackers!

Dear Andres, please forgive me for the typo in your name in my previous
message. It was not intentional. My blushes!

v09 is the same as v08, except for a difference in test/dfor/Makefile.
I am still trying to fix the build errors in CI/CD.

P.s. rebased onto e0fa5bd1465.

Attachments:

v09-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v09-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2919-2
v09-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v09-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1202-2
v09-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v09-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+755-35
#13Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#12)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

The problem with GCC dependencies appearing in Debian Trixie in
test/dfor has been replayed locally and fixed.

P.S. Rebased onto 11d6042337f.

Attachments:

v10-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v10-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+2912-2
v10-0002-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v10-0002-Implement-Delta-Frame-of-Reference-compression.patchDownload+1202-2
v10-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v10-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+755-35
#14Andres Freund
andres@anarazel.de
In reply to: Evgeny Voropaev (#11)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 2026-04-08 20:34:42 +0800, Evgeny Voropaev wrote:

Tomas, Andreus, Andrey, hello!

A ~170kB patch really should present some numbers
quantifying the expected benefit. It doesn't need to be a real workload
from production, but something plausible enough. Even some basic
back-of-the-envelope calculations might be enough to show the promise.

The patch results in reduction of WAL total size by:
81% during vacuuming a table having no index,
and by 55% during vacuuming a table having an index.

The numbers are the next:

=== VACUUM (table with no index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 6743149 1184446 82%
Prune records size 6710185 1159723 5.8x
-------------------- ----------------- ----------------- -----------

=== VACUUM (table with index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 20394208 8907090 56%
Prune records size 6812850 1225944 5.6x
-------------------- ----------------- ----------------- -----------

These numbers make the impact sound bigger than I think it really is:

- They neglect that the insert generates ~183MB of WAL, the delete ~161MB
without indexes and ~243MB / 161MB with. In contrast to that 6.7Mb isn't
particularly significant.

- Workloads deleting almost all records in the table but leaving some in to
prevent truncation aren't particularly common.

- The narrowness of the rows (~30 bytes, with row header) makes the wins much
bigger than they'd be in realistic cases

- The workload doesn't involve any FPIs. It's much more common to have
vacuum's occur later and trigger FPIs.

Heh. In this case FPIs actually would *reduce* the overhead of the current
code, because the page is so empty after all the deletes that the FPI uses
less space than the update . It's 4.1MB when not using indexes and not using
wal compression and 1MB with wal compression.

Seems we could get a fair bit of benefit by just using a heuristic to switch
to an FPI when there are enough changes.

I think that'd just be a few lines.

Greetings,

Andres Freund

#15Andrey Borodin
amborodin@acm.org
In reply to: Andres Freund (#14)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 9 Apr 2026, at 21:50, Andres Freund <andres@anarazel.de> wrote:

These numbers make the impact sound bigger than I think it really is:

- They neglect that the insert generates ~183MB of WAL, the delete ~161MB
without indexes and ~243MB / 161MB with. In contrast to that 6.7Mb isn't
particularly significant.

Well, vacuuming of a bloated tables does happen. The amount of WAL needed to
accumulate bloat does not invalidate benefits of reducing WAL needed to vacuum
bloat.

- Workloads deleting almost all records in the table but leaving some in to
prevent truncation aren't particularly common.

Queue tables may be kind of antipattern, yet users use such tables. And sometimes
they tend to have ~90% bloat. And cause 99% of problems.

- The narrowness of the rows (~30 bytes, with row header) makes the wins much
bigger than they'd be in realistic cases

It’s crafted benchmark to demonstrate bright side. It’s also super easy to demonstrate
the case when proposed patch does not give any benefit at all.

Evgeny, do you know of any cases when the patch has negative effect?

I think if it’s strictly non-negative - then we can just weight complexity of maintaining
the proposed approach against benefits.

- The workload doesn't involve any FPIs. It's much more common to have
vacuum's occur later and trigger FPIs.

AFAIU, without special handling FPIs will not substitute xl_heap_prune, will they?

Heh. In this case FPIs actually would *reduce* the overhead of the current
code, because the page is so empty after all the deletes that the FPI uses
less space than the update . It's 4.1MB when not using indexes and not using
wal compression and 1MB with wal compression.

Seems we could get a fair bit of benefit by just using a heuristic to switch
to an FPI when there are enough changes.

I think we could even do it in a generic way: if the record body is heavier that its FPIs - just
do FPIs only.

Best regards, Andrey Borodin.

#16Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Andrey Borodin (#15)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Evgeny, do you know of any cases when the patch has negative effect?

I think if it’s strictly non-negative - then we can just weight complexity of maintaining
the proposed approach against benefits.

Andrey, during my research and testing I haven't encountered any
negative effects from this patch on the WAL size.

#17Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#16)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Continue fixing CI issues in this patch. Resolved frz_offsets
misalignment caused by odd byte counts in dead or unused tuple packs.

Patch set has been split, unit-tests have been moved into separated
patch-files. It shows that the very essence of the patch has size of
90 KB only.

P.S. Rebased onto fce3f7d2677.

Attachments:

v11-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v11-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+824-1
v11-0002-Tests-of-vect-and-uniqsortvect-containers-and-of.patchtext/x-patch; charset=UTF-8; name=v11-0002-Tests-of-vect-and-uniqsortvect-containers-and-of.patchDownload+2088-2
v11-0003-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v11-0003-Implement-Delta-Frame-of-Reference-compression.patchDownload+841-1
v11-0004-Tests-for-Delta-Frame-of-Reference-unit.patchtext/x-patch; charset=UTF-8; name=v11-0004-Tests-for-Delta-Frame-of-Reference-unit.patchDownload+376-2
v11-0005-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v11-0005-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+782-45
#18Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Evgeny Voropaev (#6)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 24/03/2026 16:28, Evgeny Voropaev wrote:

Also I hope, independently of its usage in prune/freeze records, the
DFoR itself might be used for compression sequences in other places of PG.

Yeah, that would make this huge amount of new code much more palatable.

I had a similar thought when I added src/backend/lib/integerset.c, I
planned to also use it for holding the dead TID list in vacuum for
starters, and possibly for more things in the future. That plan was
foiled because we got parallel VACUUM instead, which moved the TID list
to shared memory, and I didn't account for that in integerset.c. So now
integerset.c is only used for GiST vacuum, which is a pretty narrow use
case.

Can this DFoR code replace integerset.c easily? Can we use it for the
vacuum dead TID list? For GIN posting lists? Where else?

- Heikki

#19Andrey Borodin
amborodin@acm.org
In reply to: Heikki Linnakangas (#18)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

On 14 Apr 2026, at 14:02, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Can this DFoR code replace integerset.c easily?

Or, possibly, teach integerset to be serializable to WAL record or anywhere else.

Best regards, Andrey Borodin.

#20Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Heikki Linnakangas (#18)
Re: Compress prune/freeze records with Delta Frame of Reference algorithm

Hello hackers,

Can this DFoR code replace integerset.c easily? Can we use it for
the vacuum dead TID list? For GIN posting lists? Where else?

Heikki, thank you for your attention and proposals. I'm learning areas
you proposed to be developed. This took time, since I am not adept at
them. Last week I also have been developing the DFoR patch to support
unsorted sequences. That's why there was the delay in answering.

About GIN.
Since GIN exploits TIDs sequences and saves it on the disk, it can be
the most appropriate candidate to be developed with DFoR.

About the dead TID list.
If I'm not mistaken, the dead TID list exists only in RAM and never on
the disk or in the network. So, what is the advantage supposed to be
achieved due to using compression in the dead TID list?

About the GiST vacuuming and the use of integerset in it.
The integerset implements a tree in addition to compression.
DFoR now performs only compression. Moreover the size of a pack is
flexible (varying), which must become an issue for its usage in the
tree. It needs more thorough further elaboration to be developed.

So what do you think about improving GIN by means of DFOR? Should I try?

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.

P.S.
In the v12 version of the patch:
- implemented the DFOR compression for unsorted sequences;
- implemented the compression of frozen and redirected tuple offsets
in the prune/freeze WAL record
- ignored header checking of header templates from DFOR file set;
- rebased onto 9b43e6793b0.

Attachments:

v12-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchtext/x-patch; charset=UTF-8; name=v12-0001-Implement-vect-and-uniqsortvect-containers-and-b.patchDownload+810-1
v12-0002-Tests-of-vect-and-uniqsortvect-containers-and-of.patchtext/x-patch; charset=UTF-8; name=v12-0002-Tests-of-vect-and-uniqsortvect-containers-and-of.patchDownload+2088-2
v12-0003-Implement-Delta-Frame-of-Reference-compression.patchtext/x-patch; charset=UTF-8; name=v12-0003-Implement-Delta-Frame-of-Reference-compression.patchDownload+930-1
v12-0004-Tests-for-Delta-Frame-of-Reference-unit.patchtext/x-patch; charset=UTF-8; name=v12-0004-Tests-for-Delta-Frame-of-Reference-unit.patchDownload+541-2
v12-0005-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchtext/x-patch; charset=UTF-8; name=v12-0005-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patchDownload+866-60
#21Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Evgeny Voropaev (#20)
#22Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Evgeny Voropaev (#20)