XLog size reductions: Reduced XLog record header size for PG17

Started by Matthias van de Meentalmost 3 years ago11 messageshackers
Jump to latest
#1Matthias van de Meent
boekewurm+postgres@gmail.com

Hi,

As I've mentioned earlier on-list [0]/messages/by-id/CAEze2Whf=fwAj7rosf6aDM9t+7MU1w-bJn28HFWYGkz+ics-hg@mail.gmail.com[1]/messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com and off-list, I've been
working on reducing the volume of WAL that we write. This is one
intermediate step towards that effort.

Attached is a patchset that reduces the storage overhead of each WAL
record, with theoretical total savings in pgbench transactions
somewhere between 4-5%, depending on platform and the locality of
updates. These savings are achieved by reducing the amount of data
stored, and by not emitting bytes that don't carry data relevant to
each record's redo- and decode routines.

Patches contained:
0001 is copied essentially verbatim from [1]/messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com and reduces overhead in
the registered block's length field where possible. It is included to
improve code commonality between varcoded integer fields. See [1]/messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com for
more details.

0002 updates accesses to the rmgr-managed bits of xl_info with its own
macro returning only rmgr-managed bits, and updates XLogRecGetInfo()
to return only the xlog-managed bits.

0003 renames the rm_identify argument from 'info' to 'rmgrinfo'; and
stops providing the xlog-managed bits to the function - rmgrs have no
need to know the xlog internal info bits.

0004 continues on 0003 and moves the rmgr info bits into their own
xl_rmgrinfo of type uint8, stored in the alignment hole in the
XLogRecord struct.

0005 updates the code to only include a valid XID in the record when
the rmgr actually needs to use that XID.

0006 implements a new, variable length, WAL record header format,
previously discussed at [0]/messages/by-id/CAEze2Whf=fwAj7rosf6aDM9t+7MU1w-bJn28HFWYGkz+ics-hg@mail.gmail.com and [2]/messages/by-id/CA+Tgmoaa9Yc9O-FP4vS_xTKf8Wgy8TzHpjnjN56_ShKE=jrP-Q@mail.gmail.com. This new WAL record header is a
minimum of 14 bytes large, but generally will be 15 to 21 bytes in
size, depending on the data contained, the type of record, and whether
the record needs an XID.

Notes:
- The patchset includes [1]/messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com for its variable-length encoding of
uint32, and this is included in the savings calculation.
- Not all records include the backend's XID anymore. XLog API users
must explicitly request the inclusion of XID in the record with the
XLOG_INCLUDE_XID record flag.
- XLog records are now always aligned to 8 bytes. This was needed to
reduce the complexity of var-coding the record length on 32-bit
systems. Savings on 32-bit systems still exist, but can be expected to
be less impactful.
- XLog length is now varlength encoded. No more records with <255
bytes of data storing 3 0-bytes - the length is now stored in 0, 1, 2
or 4 bytes.
- RMGRs now get their own uint8 info/flags field. No more sharing bits
with WAL infrastructure in xl_info. The byte is only stored if it is
non-0, and otherwise omitted (indicated by flag bits in xl_info).

Todo:
- Check for any needed documentation / code comments updates
- benchmark this

Future work:
- we could omit XLR_BLOCK_ID_DATA_[LONG,SHORT] if it is the only
"block ID" in the record (such as checkpoint records, commit/rollback
records, etc.). This would be indicated by a xl_info bit, and this
would save 2-5 bytes per applicable record.
- This patch inherits [1]/messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com's property in which we can release the
BKPBLOCK_HAS_DATA flag bit (its value is already implied by
XLR_BLOCKID_SZCLASS), allowing us to use it for something else, like
indicating varcoded RelFileLocator/BlockId.

Kind regards,

Matthias van de Meent
Neon, Inc.

[0]: /messages/by-id/CAEze2Whf=fwAj7rosf6aDM9t+7MU1w-bJn28HFWYGkz+ics-hg@mail.gmail.com
[1]: /messages/by-id/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ@mail.gmail.com
[2]: /messages/by-id/CA+Tgmoaa9Yc9O-FP4vS_xTKf8Wgy8TzHpjnjN56_ShKE=jrP-Q@mail.gmail.com

Attachments:

v1-0006-Reformat-the-XLog-record-header.patchapplication/octet-stream; name=v1-0006-Reformat-the-XLog-record-header.patchDownload+480-156
v1-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchapplication/octet-stream; name=v1-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchDownload+188-17
v1-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchapplication/octet-stream; name=v1-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchDownload+89-85
v1-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchapplication/octet-stream; name=v1-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchDownload+31-14
v1-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchapplication/octet-stream; name=v1-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchDownload+56-57
v1-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchapplication/octet-stream; name=v1-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchDownload+105-103
#2Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Matthias van de Meent (#1)
Re: XLog size reductions: Reduced XLog record header size for PG17

Hi,

The attached v2 patchset contains some small fixes for the failing
cfbot 32-bit tests - at least locally it does so.

I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
patch of the set, which has now been updated to XLP_ALIGN as well.
Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
systems.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Attachments:

v2-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchapplication/octet-stream; name=v2-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchDownload+89-85
v2-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchapplication/octet-stream; name=v2-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchDownload+56-57
v2-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchapplication/octet-stream; name=v2-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchDownload+188-17
v2-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchapplication/octet-stream; name=v2-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchDownload+105-103
v2-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchapplication/octet-stream; name=v2-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchDownload+31-14
v2-0006-Reformat-the-XLog-record-header.patchapplication/octet-stream; name=v2-0006-Reformat-the-XLog-record-header.patchDownload+483-159
#3Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Matthias van de Meent (#2)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Hi,

The attached v2 patchset contains some small fixes for the failing
cfbot 32-bit tests - at least locally it does so.

I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
patch of the set, which has now been updated to XLP_ALIGN as well.
Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
systems.

Apparently there was some usage of MAXALIGN() in xlogreader that I'd
missed, and which only shows up in TAP tests. In v3 I've fixed that,
together with some improved early detection of invalid record headers.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Attachments:

v3-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchapplication/octet-stream; name=v3-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchDownload+31-14
v3-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchapplication/octet-stream; name=v3-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchDownload+188-17
v3-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchapplication/octet-stream; name=v3-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchDownload+105-103
v3-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchapplication/octet-stream; name=v3-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchDownload+56-57
v3-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchapplication/octet-stream; name=v3-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchDownload+89-85
v3-0006-Reformat-the-XLog-record-header.patchapplication/octet-stream; name=v3-0006-Reformat-the-XLog-record-header.patchDownload+502-165
#4Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Matthias van de Meent (#3)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Hi,

The attached v2 patchset contains some small fixes for the failing
cfbot 32-bit tests - at least locally it does so.

I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
patch of the set, which has now been updated to XLP_ALIGN as well.
Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
systems.

Apparently there was some usage of MAXALIGN() in xlogreader that I'd
missed, and which only shows up in TAP tests. In v3 I've fixed that,
together with some improved early detection of invalid record headers.

Another fix for CFBot - pg_waldump tests which were added in 96063e28
exposed an issue in my patchset related to RM_INVALID_ID.

v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and
does the rmgr-related changes in the code, and 0007 does the WAL disk
format overhaul.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Attachments:

v4-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchapplication/octet-stream; name=v4-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchDownload+56-57
v4-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchapplication/octet-stream; name=v4-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchDownload+31-14
v4-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchapplication/octet-stream; name=v4-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchDownload+89-85
v4-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchapplication/octet-stream; name=v4-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchDownload+105-103
v4-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchapplication/octet-stream; name=v4-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchDownload+188-17
v4-0006-Add-RM_INVALID-an-invalid-resource-manager.patchapplication/octet-stream; name=v4-0006-Add-RM_INVALID-an-invalid-resource-manager.patchDownload+18-5
v4-0007-Reformat-the-XLog-record-header.patchapplication/octet-stream; name=v4-0007-Reformat-the-XLog-record-header.patchDownload+486-163
#5Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Matthias van de Meent (#4)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Wed, 12 Jul 2023 at 14:50, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Hi,

The attached v2 patchset contains some small fixes for the failing
cfbot 32-bit tests - at least locally it does so.

I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
patch of the set, which has now been updated to XLP_ALIGN as well.
Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
systems.

Apparently there was some usage of MAXALIGN() in xlogreader that I'd
missed, and which only shows up in TAP tests. In v3 I've fixed that,
together with some improved early detection of invalid record headers.

Another fix for CFBot - pg_waldump tests which were added in 96063e28
exposed an issue in my patchset related to RM_INVALID_ID.

v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and
does the rmgr-related changes in the code, and 0007 does the WAL disk
format overhaul.

V5 is a rebased version of v4, and includes the latest patch from
"smaller XLRec block header" [0]/messages/by-id/CAEze2WhG_qvs0_HPCKyGLjFSSeiLZJcFhT=rzEUd7AzyxnSfKw@mail.gmail.com as 0001.

Kind regards,

Matthias van de Meent

[0]: /messages/by-id/CAEze2WhG_qvs0_HPCKyGLjFSSeiLZJcFhT=rzEUd7AzyxnSfKw@mail.gmail.com

Attachments:

v5-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchapplication/octet-stream; name=v5-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patchDownload+56-57
v5-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchapplication/octet-stream; name=v5-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patchDownload+89-85
v5-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchapplication/octet-stream; name=v5-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patchDownload+105-103
v5-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchapplication/octet-stream; name=v5-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patchDownload+204-17
v5-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchapplication/octet-stream; name=v5-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patchDownload+31-14
v5-0006-Add-RM_INVALID-an-invalid-resource-manager.patchapplication/octet-stream; name=v5-0006-Add-RM_INVALID-an-invalid-resource-manager.patchDownload+18-5
v5-0007-Reformat-the-XLog-record-header.patchapplication/octet-stream; name=v5-0007-Reformat-the-XLog-record-header.patchDownload+486-163
#6Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#5)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:

V5 is a rebased version of v4, and includes the latest patch from
"smaller XLRec block header" [0] as 0001.

0001 and 0007 are the meat of the changes.

-#define XLR_CHECK_CONSISTENCY  0x02
+#define XLR_CHECK_CONSISTENCY  (0x20)

I can't help but notice that there are a few stylistic choices like
this one that are part of the patch. Using parenthesis in the case of
hexa values is inconsistent with the usual practices I've seen in the
tree.

 #define COPY_HEADER_FIELD(_dst, _size)            \
     do {                                        \
-        if (remaining < _size)                    \
+        if (remaining < (_size))                \
             goto shortdata_err;                    \

There are a couple of stylistic changes like this one, that I guess
could just use their own patch to make these macros easier to use.

-#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
+#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK)
+#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK)

This stuff in 0002 is independent of 0001, am I right? Doing this
split with an extra macro is okay by me, reducing the presence of
XLR_INFO_MASK and bitwise operations based on it.

0003 is also mechanical, but if you begin to enforce the use of
XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
identity callback, we should have at least a validity check to make
sure that nothing, even custom RMGRs, pass down unexpected bits?

I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
I fear that people are going to forget to set it. Wouldn't it be
better to use an option where the XID is excluded instead, making the
inclusing the an XID the default?

The resource manager has ID = 0, thus requiring some special
handling in other code. Apart from being generally useful, it is
used in future patches to detect the end of wal in lieu of a zero-ed
fixed-size xl_tot_len field.

Err, no, that may not be true. See for example this thread where the
topic of improving the checks of xl_tot_len and rely on this value on
when a record header has been validated, even across page borders:
/messages/by-id/17928-aa92416a70ff44a2@postgresql.org

Except that, in which cases could an invalid RMGR be useful?
--
Michael

#7Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#6)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:

V5 is a rebased version of v4, and includes the latest patch from
"smaller XLRec block header" [0] as 0001.

0001 and 0007 are the meat of the changes.

Correct.

-#define XLR_CHECK_CONSISTENCY  0x02
+#define XLR_CHECK_CONSISTENCY  (0x20)

I can't help but notice that there are a few stylistic choices like
this one that are part of the patch. Using parenthesis in the case of
hexa values is inconsistent with the usual practices I've seen in the
tree.

Yes, I'll take another look at that.

#define COPY_HEADER_FIELD(_dst, _size)            \
do {                                        \
-        if (remaining < _size)                    \
+        if (remaining < (_size))                \
goto shortdata_err;                    \

There are a couple of stylistic changes like this one, that I guess
could just use their own patch to make these macros easier to use.

They actually fix complaints of my IDE, but are otherwise indeed stylistic.

-#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
+#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK)
+#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK)

This stuff in 0002 is independent of 0001, am I right? Doing this
split with an extra macro is okay by me, reducing the presence of
XLR_INFO_MASK and bitwise operations based on it.

Yes, that change is to stop making use of (~XLR_INFO_MASK) where
XLR_RMGR_INFO_MASK is the correct bitmask (whilst also being quite
useful in the later patch).

0003 is also mechanical, but if you begin to enforce the use of
XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
identity callback, we should have at least a validity check to make
sure that nothing, even custom RMGRs, pass down unexpected bits?

I think that's already handled in XLogInsert(), but I'll make sure to
add more checks if they're not in place yet.

I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
I fear that people are going to forget to set it. Wouldn't it be
better to use an option where the XID is excluded instead, making the
inclusing the an XID the default?

Most rmgrs don't actually use the XID. Only XACT, MULTIXACT, HEAP,
HEAP2, and LOGICALMSG use the xid, so I thought it would be easier to
just find the places where those RMGR's records were being logged than
to update all other places.

I don't mind changing how we decide to log the XID, but I don't think
EXCLUDE_XID is a good alternative: most records just don't need the
transaction ID. There are many more index AMs with logging than table
AMs, so I don't think it is that weird to default to 'not included'.

The resource manager has ID = 0, thus requiring some special
handling in other code. Apart from being generally useful, it is
used in future patches to detect the end of wal in lieu of a zero-ed
fixed-size xl_tot_len field.

Err, no, that may not be true. See for example this thread where the
topic of improving the checks of xl_tot_len and rely on this value on
when a record header has been validated, even across page borders:
/messages/by-id/17928-aa92416a70ff44a2@postgresql.org

Yes, there are indeed exceptions when reusing WAL segments, but it's
still a good canary, like xl_tot_len before this patch.

Except that, in which cases could an invalid RMGR be useful?

A sentinel value that is obviously invalid is available for several
types, e.g. BlockNumber, TransactionId, XLogRecPtr, Buffer, and this
is quite useful if you want to check if something is definitely
invalid. I think that's fine in principle, we're already "wasting"
some IDs in the gap between RM_MAX_BUILTIN_ID and RM_MIN_CUSTOM_ID.

In the current xlog infrastructure, we use xl_tot_len as that sentinel
to detect whether a new record may exist, but in this patch that can't
be used because the field may not exist and depends on other bytes. So
I used xl_rmgr_id as the field to base the 'may a next record exist'
checks on, which required the 0 rmgr ID to be invalid.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#8Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#7)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote:

On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote:

#define COPY_HEADER_FIELD(_dst, _size)            \
do {                                        \
-        if (remaining < _size)                    \
+        if (remaining < (_size))                \
goto shortdata_err;                    \

There are a couple of stylistic changes like this one, that I guess
could just use their own patch to make these macros easier to use.

They actually fix complaints of my IDE, but are otherwise indeed stylistic.

Oh, OK. I just use an old-school terminal, but no objections in
changing these if they make life easier for some hackers. Still, that
feels independant of what you are proposing here.
--
Michael

#9Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Michael Paquier (#8)
Re: XLog size reductions: Reduced XLog record header size for PG17

Hi and Happy New Year!

I've looked through the patches and the change seems quite small and
justified. But at the second round, some doubt arises on whether this long
patchset indeed introduces enough performance gain? I may be wrong, but it
saves only several bytes and the performance gain would be only in some
specific artificial workload. Did you do some measurements? Do we have
several percent performance-wise?

Kind regards,
Pavel Borisov

#10Andrey Borodin
amborodin@acm.org
In reply to: Pavel Borisov (#9)
Re: XLog size reductions: Reduced XLog record header size for PG17

On 3 Jan 2024, at 15:15, Pavel Borisov <pashkin.elfe@gmail.com> wrote:

Hi and Happy New Year!

I've looked through the patches and the change seems quite small and justified. But at the second round, some doubt arises on whether this long patchset indeed introduces enough performance gain? I may be wrong, but it saves only several bytes and the performance gain would be only in some specific artificial workload. Did you do some measurements? Do we have several percent performance-wise?

Kind regards,
Pavel Borisov

Hi Matthias!

This is a kind reminder that the thread is waiting for your reply. Are you interesting in CF entry [0]https://commitfest.postgresql.org/47/4386/?

Thanks!

Best regards, Andrey Borodin.
[0]: https://commitfest.postgresql.org/47/4386/

#11Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Matthias van de Meent (#1)
Re: XLog size reductions: Reduced XLog record header size for PG17

On Jun 20, 2023, at 1:01 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:

0001 is copied essentially verbatim from [1] and reduces overhead in
the registered block's length field where possible. It is included to
improve code commonality between varcoded integer fields. See [1] for
more details.

Hi Matthias! I am interested in seeing this patch move forward. We seem to be stuck.

The disagreement on the other thread seems to be about whether we can generalize and reuse variable integer encoding. Could you comment on whether perhaps we just need a few versions of that? Perhaps one version where the number of length bytes is encoded in the length itself (such as is used for varlena and by Andres' patch) and one where the number of length bytes is stored elsewhere? You are clearly using the "elsewhere" form, but perhaps you could pull out the logic of that into src/common? In struct XLogRecordBlockHeader.id <http://xlogrecordblockheader.id/&gt;, you are reserving two bits for the size class. (The code comments aren't clear about this, by the way.) Perhaps if the generalized length encoding logic could take a couple arguments to represent where and how the size class bits are to be stored, and where the length itself is stored? I doubt you need to sacrifice any performance gains of this patch to make that happen. You'd just need to restructure the patch.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company