SLRUs in the main buffer pool - Page Header definitions

Started by Bagga, Rishualmost 4 years ago37 messageshackers
Jump to latest
#1Bagga, Rishu
bagrishu@amazon.com

Hi all,

PostgreSQL currently maintains several data structures in the SLRU
cache. The SLRU cache has scaling and sizing challenges because of it’s
simple implementation. The goal is to move these caches to the common
buffer cache to benefit from the stronger capabilities of the common
buffercache code. At AWS, we are building on the patch shared by Thomas
Munro [1] /messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com, which treats the SLRU pages as part of a pseudo-databatabe
of ID 9. We will refer to the pages belonging to SLRU components as
BufferedObject pages going forward.

The current SLRU pages do not have any header, so there is a need to
create a new page header format for these. Our investigations revealed
that we need to:

1. track LSN to ensure durability and consistency of all pages (for redo
   and full page write purposes)
2. have a checksum (for page correctness verification).
3. A flag to identify if the page is a relational or BufferedObject
4. Track version information.

We are suggesting a minimal BufferedObject page header
to be the following, overlapping with the key fields near the beginning
of the regular PageHeaderData:

typedef struct BufferedObjectPageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16_t       pd_checksum;
    uint16_t       pd_flags;
    uint16_t       pd_pagesize_version;
} BufferedObjectPageHeaderData;

For reference, the regular page header looks like the following:
typedef struct PageHeaderData
{
    PageXLogRecPtr    pd_lsn;
    uint16_t    pd_checksum;
    uint16_t    pd_flags;
    LocationIndex   pd_lower;
    LocationIndex   pd_upper;
    LocationIndex   pd_special;
    uint16_t           pd_pagesize_version;
    TransactionId   pd_prune_xid;
    ItemIdDataCommon  pd_linp[];
} PageHeaderData;

After careful review, we have trimmed out the heap and index specific
fields from the suggested header that do not add any value to SLRU
components. We plan to use pd_lsn, pd_checksum, and pd_pagesize_version
in the same way that they are in relational pages. These fields are
needed to ensure consistency, durability and page correctness.

We will use the 4th bit of pd_flags to identify a BufferedObject page.
If the bit is set then this denotes a BufferedObject page. Today, bits
1 - 3 are used for determining if there are any free line pointers, if
the page is full, and if all tuples on the page are visible to
everyone, respectively. We will use this information accordingly in the
storage manager to determine which callback functions to use for file
I/O operations. This approach allows the buffercache to have an
universal method to quickly determine what type of page it is dealing
with at any time.

Using the new BufferedObject page header will be space efficient but
introduces a significant change in the codebase to now track two types
of page header data. During upgrade, all SLRU files that exist on the
system must be converted to the new format with page header. This will
require rewriting all the SLRU pages with the page header as part of
pg_upgrade.

We believe that this is the correct approach for the long run. We would
love feedback if there are additional items of data that should be
tracked as well. Alternatively, we could re-use the existing page
header and the unused fields could be used as a padding. This feels
like an unclean approach but would avoid having two page header types
in the database.

[1]:  /messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com

Discussed with: Joe Conway, Nathan Bossart, Shawn Debnath

Rishu Bagga

Amazon Web Services (AWS)

#2Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#1)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

On 2022-06-22 21:06:29 +0000, Bagga, Rishu wrote:

3. A flag to identify if the page is a relational or BufferedObject

Why is this needed in the page header?

Using the new BufferedObject page header will be space efficient but
introduces a significant change in the codebase to now track two types
of page header data. During upgrade, all SLRU files that exist on the
system must be converted to the new format with page header. This will
require rewriting all the SLRU pages with the page header as part of
pg_upgrade.

How are you proposing to deal with this in the "key" to "offset in SLRU"
mapping? E.g. converting a xid to an offset in the pg_xact SLRU. I assume
you're thinking to deal with this by making the conversion math a bit more
complicated?

Greetings,

Andres Freund

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bagga, Rishu (#1)
Re: SLRUs in the main buffer pool - Page Header definitions

"Bagga, Rishu" <bagrishu@amazon.com> writes:

The current SLRU pages do not have any header, so there is a need to
create a new page header format for these. Our investigations revealed
that we need to:

1. track LSN to ensure durability and consistency of all pages (for redo
   and full page write purposes)
2. have a checksum (for page correctness verification).
3. A flag to identify if the page is a relational or BufferedObject
4. Track version information.

Isn't this a nonstarter from the standpoint of pg_upgrade?

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

On 2022-06-22 19:12:14 -0400, Tom Lane wrote:

"Bagga, Rishu" <bagrishu@amazon.com> writes:

The current SLRU pages do not have any header, so there is a need to
create a new page header format for these. Our investigations revealed
that we need to:

1. track LSN to ensure durability and consistency of all pages (for redo
�� and full page write purposes)
2. have a checksum (for page correctness verification).
3. A flag to identify if the page is a relational or BufferedObject
4. Track version information.

Isn't this a nonstarter from the standpoint of pg_upgrade?

We're rewriting some relation forks as part of pg_upgrade (visibility map
IIRC?), so rewriting an SLRU is likely not prohibitive - there's much more of
a limit to the SLRU sizes than the number and aggregate size of relation
forks.

Greetings,

Andres Freund

#5Bagga, Rishu
bagrishu@amazon.com
In reply to: Andres Freund (#2)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi Andres,
Thanks for your response.

To answer your questions:

3. A flag to identify if the page is a relational or BufferedObject

Why is this needed in the page header?

Now that we are dealing with two different type of page headers, we need to know how to interpret any given page. We need to use pd_flags to determine this.

How are you proposing to deal with this in the "key" to "offset in >SLRU"
mapping? E.g. converting a xid to an offset in the pg_xact SLRU. I >assume
you're thinking to deal with this by making the conversion math a bit >more
complicated?

You’re right; we would have to account for this in the conversion math between the ‘key’ and ‘offset’. The change to the macros would be as following:

#define MULTIXACT_OFFSETS_PER_PAGE ((BLCKSZ - SizeOfBufferedObjectPageHeaderData) / sizeof(MultiXactOffset))

Additionally, we need to account for the size of the page header when reading and writing multixacts in memory based off of the entryno.

Rishu Bagga

Amazon Web Services (AWS)

On 6/22/22, 2:40 PM, "Andres Freund" <andres@anarazel.de> wrote:

Hi,

On 2022-06-22 21:06:29 +0000, Bagga, Rishu wrote:

3. A flag to identify if the page is a relational or BufferedObject

Why is this needed in the page header?

Using the new BufferedObject page header will be space efficient but
introduces a significant change in the codebase to now track two types
of page header data. During upgrade, all SLRU files that exist on the
system must be converted to the new format with page header. This will
require rewriting all the SLRU pages with the page header as part of
pg_upgrade.

How are you proposing to deal with this in the "key" to "offset in SLRU"
mapping? E.g. converting a xid to an offset in the pg_xact SLRU. I assume
you're thinking to deal with this by making the conversion math a bit more
complicated?

Greetings,

Andres Freund

#6Robert Haas
robertmhaas@gmail.com
In reply to: Bagga, Rishu (#1)
Re: SLRUs in the main buffer pool - Page Header definitions

On Wed, Jun 22, 2022 at 5:06 PM Bagga, Rishu <bagrishu@amazon.com> wrote:

We are suggesting a minimal BufferedObject page header
to be the following, overlapping with the key fields near the beginning
of the regular PageHeaderData:

typedef struct BufferedObjectPageHeaderData
{
PageXLogRecPtr pd_lsn;
uint16_t pd_checksum;
uint16_t pd_flags;
uint16_t pd_pagesize_version;
} BufferedObjectPageHeaderData;

For reference, the regular page header looks like the following:
typedef struct PageHeaderData
{
PageXLogRecPtr pd_lsn;
uint16_t pd_checksum;
uint16_t pd_flags;
LocationIndex pd_lower;
LocationIndex pd_upper;
LocationIndex pd_special;
uint16_t pd_pagesize_version;
TransactionId pd_prune_xid;
ItemIdDataCommon pd_linp[];
} PageHeaderData;

After careful review, we have trimmed out the heap and index specific
fields from the suggested header that do not add any value to SLRU
components. We plan to use pd_lsn, pd_checksum, and pd_pagesize_version
in the same way that they are in relational pages. These fields are
needed to ensure consistency, durability and page correctness

I think that it's not worth introducing a new page header format to
save 10 bytes per page. Keeping things on the same format is likely to
save more than the minor waste of space costs.

--
Robert Haas
EDB: http://www.enterprisedb.com

#7Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#5)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

On 2022-06-23 20:25:21 +0000, Bagga, Rishu wrote:

3. A flag to identify if the page is a relational or BufferedObject

Why is this needed in the page header?

Now that we are dealing with two different type of page headers, we need to
know how to interpret any given page. We need to use pd_flags to determine
this.

When do we need to do so? We should never need to figure out whether an
on-disk block is for an SLRU or something else, without also knowing which
relation / SLRU it is in.

Greetings,

Andres Freund

#8Bagga, Rishu
bagrishu@amazon.com
In reply to: Andres Freund (#7)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi Andres,

When do we need to do so? We should never need to figure out whether an
on-disk block is for an SLRU or something else, without also knowing >which
relation / SLRU it is in.

You are correct that we wouldn’t need to rely on the pd_flag bit to determine page type for any access to a page where we come top down following the hierarchy. However, for the purpose of debugging “from the bottom up” it would be critical to know what type of page is being read in a system with multiple page header types.

On 6/23/22, 2:22 PM, "Andres Freund" <andres@anarazel.de> wrote:

Hi,

On 2022-06-23 20:25:21 +0000, Bagga, Rishu wrote:

3. A flag to identify if the page is a relational or BufferedObject

Why is this needed in the page header?

Now that we are dealing with two different type of page headers, we need to
know how to interpret any given page. We need to use pd_flags to determine
this.

When do we need to do so? We should never need to figure out whether an
on-disk block is for an SLRU or something else, without also knowing which
relation / SLRU it is in.

Greetings,

Andres Freund

#9Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#8)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

On 2022-06-24 00:39:41 +0000, Bagga, Rishu wrote:

When do we need to do so? We should never need to figure out whether an
on-disk block is for an SLRU or something else, without also knowing >which
relation / SLRU it is in.

You are correct that we wouldn’t need to rely on the pd_flag bit to
determine page type for any access to a page where we come top down
following the hierarchy. However, for the purpose of debugging “from the
bottom up” it would be critical to know what type of page is being read in a
system with multiple page header types.

That doesn't seem to justify using a bit on the page. Wouldn't it suffice to
add such information to the BufferDesc?

Greetings,

Andres Freund

#10Shawn Debnath
clocksweep@gmail.com
In reply to: Andres Freund (#9)
Re: SLRUs in the main buffer pool - Page Header definitions

On Thu, Jun 23, 2022 at 06:06:48PM -0700, Andres Freund wrote:

You are correct that we wouldn’t need to rely on the pd_flag bit to
determine page type for any access to a page where we come top down
following the hierarchy. However, for the purpose of debugging “from the
bottom up” it would be critical to know what type of page is being read in a
system with multiple page header types.

That doesn't seem to justify using a bit on the page. Wouldn't it suffice to
add such information to the BufferDesc?

The goal for the bit in pd_flags is to help identify what page header
should be present if one were to be looking at the physical page outside
of the database. One example that comes to mind is pg_upgrade. There
are other use cases where physical consistency checks could be applied,
again outside of a running database.

On Thu, Jun 23, 2022 at 04:27:33PM -0400, Robert Haas wrote:

I think that it's not worth introducing a new page header format to
save 10 bytes per page. Keeping things on the same format is likely to
save more than the minor waste of space costs.

Yeah, I think we are open to both approaches, though we believe it would
be cleaner to get started with a targeted page header for the new code.
But do understand having to understand/translate/deal with two page
header types might not be worth the savings in space.

If we stick with the current page header, of course, changes to pd_flag
won't be necessary anymore.

Stepping back, it seems like folks are okay with introducing a page
header to current SLRU components and that we are leaning towards
sticking with the default one for now. We can proceed with this
approach, and if needed, change it later if more folks chime in.

Cheers.

--
Shawn Debnath
Amazon Web Services (AWS)

#11Andres Freund
andres@anarazel.de
In reply to: Shawn Debnath (#10)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

On 2022-06-24 22:19:33 +0000, Shawn Debnath wrote:

On Thu, Jun 23, 2022 at 06:06:48PM -0700, Andres Freund wrote:

You are correct that we wouldn’t need to rely on the pd_flag bit to
determine page type for any access to a page where we come top down
following the hierarchy. However, for the purpose of debugging “from the
bottom up” it would be critical to know what type of page is being read in a
system with multiple page header types.

That doesn't seem to justify using a bit on the page. Wouldn't it suffice to
add such information to the BufferDesc?

The goal for the bit in pd_flags is to help identify what page header
should be present if one were to be looking at the physical page outside
of the database. One example that comes to mind is pg_upgrade. There
are other use cases where physical consistency checks could be applied,
again outside of a running database.

Outside the database you'll know the path to the file, which will tell you
it's not another kind of relation.

This really makes no sense to me. We don't have page flags indicating whether
a page is a heap, btree, visibility, fms whatever kind of page either. On a
green field, it'd make sense to have such information in a metapage at the
start of each relation - but not on each page.

On Thu, Jun 23, 2022 at 04:27:33PM -0400, Robert Haas wrote:

I think that it's not worth introducing a new page header format to
save 10 bytes per page. Keeping things on the same format is likely to
save more than the minor waste of space costs.

Yeah, I think we are open to both approaches, though we believe it would
be cleaner to get started with a targeted page header for the new code.
But do understand having to understand/translate/deal with two page
header types might not be worth the savings in space.

Not sure if that changes anything, but it's maybe worth noting that we already
have some types of pages that don't use the full page header (at least
freespace.c, c.f. buffer_std argument to MarkBufferDirtyHint()). I can see an
argument for shrinking the "uniform" part of the page header, and pushing more
things down into AMs. But I think that'd need to change the existing code, not
just introduce something new for new code.

Stepping back, it seems like folks are okay with introducing a page
header to current SLRU components and that we are leaning towards
sticking with the default one for now. We can proceed with this
approach, and if needed, change it later if more folks chime in.

I think we're clearly going to have to do that at some point not too far
away. There's just too many capabilities that are made harder by not having
that information for SLRU pages. That's not to say that it's a prerequisite to
moving SLRUs into the buffer pool (using a hack like Thomas did until the page
header is introduced). Both are complicated enough projects on their own. I
also could see adding the page header before moving SLRUs in the buffer pool,
there isn't really a hard dependency.

Greetings,

Andres Freund

#12Shawn Debnath
clocksweep@gmail.com
In reply to: Andres Freund (#11)
Re: SLRUs in the main buffer pool - Page Header definitions

On Fri, Jun 24, 2022 at 03:45:34PM -0700, Andres Freund wrote:

Outside the database you'll know the path to the file, which will tell you
it's not another kind of relation.

This really makes no sense to me. We don't have page flags indicating whether
a page is a heap, btree, visibility, fms whatever kind of page either. On a
green field, it'd make sense to have such information in a metapage at the
start of each relation - but not on each page.

Alright, I agree with the metapage having the necessary info. In
this case, we can rely on the hierarchy to determine the type of header.
Given we do not have an usecase requiring the flag, we should not
consume it at this point.

On Thu, Jun 23, 2022 at 04:27:33PM -0400, Robert Haas wrote:

I think that it's not worth introducing a new page header format to
save 10 bytes per page. Keeping things on the same format is likely to
save more than the minor waste of space costs.

Yeah, I think we are open to both approaches, though we believe it would
be cleaner to get started with a targeted page header for the new code.
But do understand having to understand/translate/deal with two page
header types might not be worth the savings in space.

Not sure if that changes anything, but it's maybe worth noting that we already
have some types of pages that don't use the full page header (at least
freespace.c, c.f. buffer_std argument to MarkBufferDirtyHint()). I can see an
argument for shrinking the "uniform" part of the page header, and pushing more
things down into AMs. But I think that'd need to change the existing code, not
just introduce something new for new code.

We did think through a universal page header concept that included just
the pd_lsn, pd_checksum, pd_flags and pulling in pd_pagesize_version and other
fields as the non-uniform members for SLRU. Unfortunately, there is a gap of
48 bits after pd_flags which makes it challenging with today's header. I am
leaning towards the standard page header for now and revisiting the universal/uniform
page header and AM changes in a separate effort. The push down to AM is an
interesting concept and should be worthwhile following up on.

Stepping back, it seems like folks are okay with introducing a page
header to current SLRU components and that we are leaning towards
sticking with the default one for now. We can proceed with this
approach, and if needed, change it later if more folks chime in.

I think we're clearly going to have to do that at some point not too far
away. There's just too many capabilities that are made harder by not having
that information for SLRU pages. That's not to say that it's a prerequisite to
moving SLRUs into the buffer pool (using a hack like Thomas did until the page
header is introduced). Both are complicated enough projects on their own. I
also could see adding the page header before moving SLRUs in the buffer pool,
there isn't really a hard dependency.

To be honest, given the nature of changes, I would prefer to have it
done in one major version release than have it be split into multiple
efforts. The value add really comes in from the consistency checks that
can be done and which are non-existent for SLRU today.

#13Bagga, Rishu
bagrishu@amazon.com
In reply to: Bagga, Rishu (#1)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

We have been working on adding page headers to the SLRU pages, as part of the migration for SLRU to buffer cache. We’ve incorporated Thomas Munro’s patch and Heikki’s Storage manager changes[1]/messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com and have a patch for early feedback.

As part of our changes we have:

1. Added page headers to the following

*Commit_TS
*CLOG
*MultiXact
*Subtrans
*Serial (predicate.c)
*Notify (async.c)

For commit_ts, clog and MultiXact, the PageXLogRecPtr field is populated with the LSN returned during the creation of a new page; as there is no WAL record for the rest, PageXLogRecPtr is set to “InvalidXlogRecPtr”.

There is one failing assert in predicate.c for SerialPagePrecedes with the page header changes; we are looking into this issue.

The page_version is set to PG_METAPAGE_LAYOUT_VERSION (which is 1)

2. Change block number passed into ReadSlruBuffer from relative to absolute, and account for SLRU’s 256kb segment size in md.c.

The changes pass the regression tests. We are still working on handling the upgrade scenario and should have a patch out for that soon.

Attached is the patch with all changes (Heikki and Munro’s patch and page headers) consolidated

Thanks,
Rishu Bagga, Amazon Web Services (AWS)

[1]: /messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com

[2]: /messages/by-id/CA+hUKG+02ZF-vjtUG4pH8bx+2Dn=eMh8GsT6jasiXZPgVxUXLw@mail.gmail.com

On 9/27/22, 6:54 PM, "Bagga, Rishu" <bagrishu@amazon.com> wrote:

Hi all,

PostgreSQL currently maintains several data structures in the SLRU
cache. The SLRU cache has scaling and sizing challenges because of it’s
simple implementation. The goal is to move these caches to the common
buffer cache to benefit from the stronger capabilities of the common
buffercache code. At AWS, we are building on the patch shared by Thomas
Munro [1]/messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com, which treats the SLRU pages as part of a pseudo-databatabe
of ID 9. We will refer to the pages belonging to SLRU components as
BufferedObject pages going forward.

The current SLRU pages do not have any header, so there is a need to
create a new page header format for these. Our investigations revealed
that we need to:

1. track LSN to ensure durability and consistency of all pages (for redo
and full page write purposes)
2. have a checksum (for page correctness verification).
3. A flag to identify if the page is a relational or BufferedObject
4. Track version information.

We are suggesting a minimal BufferedObject page header
to be the following, overlapping with the key fields near the beginning
of the regular PageHeaderData:

typedef struct BufferedObjectPageHeaderData
{
PageXLogRecPtr pd_lsn;
uint16_t pd_checksum;
uint16_t pd_flags;
uint16_t pd_pagesize_version;
} BufferedObjectPageHeaderData;

For reference, the regular page header looks like the following:
typedef struct PageHeaderData
{
PageXLogRecPtr pd_lsn;
uint16_t pd_checksum;
uint16_t pd_flags;
LocationIndex pd_lower;
LocationIndex pd_upper;
LocationIndex pd_special;
uint16_t pd_pagesize_version;
TransactionId pd_prune_xid;
ItemIdDataCommon pd_linp[];
} PageHeaderData;

After careful review, we have trimmed out the heap and index specific
fields from the suggested header that do not add any value to SLRU
components. We plan to use pd_lsn, pd_checksum, and pd_pagesize_version
in the same way that they are in relational pages. These fields are
needed to ensure consistency, durability and page correctness.

We will use the 4th bit of pd_flags to identify a BufferedObject page.
If the bit is set then this denotes a BufferedObject page. Today, bits
1 - 3 are used for determining if there are any free line pointers, if
the page is full, and if all tuples on the page are visible to
everyone, respectively. We will use this information accordingly in the
storage manager to determine which callback functions to use for file
I/O operations. This approach allows the buffercache to have an
universal method to quickly determine what type of page it is dealing
with at any time.

Using the new BufferedObject page header will be space efficient but
introduces a significant change in the codebase to now track two types
of page header data. During upgrade, all SLRU files that exist on the
system must be converted to the new format with page header. This will
require rewriting all the SLRU pages with the page header as part of
pg_upgrade.

We believe that this is the correct approach for the long run. We would
love feedback if there are additional items of data that should be
tracked as well. Alternatively, we could re-use the existing page
header and the unused fields could be used as a padding. This feels
like an unclean approach but would avoid having two page header types
in the database.

[1]: /messages/by-id/CA+hUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv+bFxghNgdieq8Q@mail.gmail.com

Discussed with: Joe Conway, Nathan Bossart, Shawn Debnath

Rishu Bagga

Amazon Web Services (AWS)

Attachments:

0001-heikki-munro-plus-page-header.patchapplication/octet-stream; name=0001-heikki-munro-plus-page-header.patchDownload+1785-3212
#14Ian Lawrence Barwick
barwick@gmail.com
In reply to: Bagga, Rishu (#13)
Re: SLRUs in the main buffer pool - Page Header definitions

2022年9月28日(水) 10:57 Bagga, Rishu <bagrishu@amazon.com>:

Hi,

We have been working on adding page headers to the SLRU pages, as part of the migration for SLRU to buffer cache. We’ve incorporated Thomas Munro’s patch and Heikki’s Storage manager changes[1] and have a patch for early feedback.

As part of our changes we have:

1. Added page headers to the following

*Commit_TS
*CLOG
*MultiXact
*Subtrans
*Serial (predicate.c)
*Notify (async.c)

For commit_ts, clog and MultiXact, the PageXLogRecPtr field is populated with the LSN returned during the creation of a new page; as there is no WAL record for the rest, PageXLogRecPtr is set to “InvalidXlogRecPtr”.

There is one failing assert in predicate.c for SerialPagePrecedes with the page header changes; we are looking into this issue.

The page_version is set to PG_METAPAGE_LAYOUT_VERSION (which is 1)

2. Change block number passed into ReadSlruBuffer from relative to absolute, and account for SLRU’s 256kb segment size in md.c.

The changes pass the regression tests. We are still working on handling the upgrade scenario and should have a patch out for that soon.

Attached is the patch with all changes (Heikki and Munro’s patch and page headers) consolidated

Hi

cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

Thanks

Ian Barwick

#15Bagga, Rishu
bagrishu@amazon.com
In reply to: Ian Lawrence Barwick (#14)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi Heikki and Thomas,

I’ve reworked your patches for moving SLRUs to the buffer cache to add page headers to the SLRUs. Additionally, I’ve rebased this patch on top of the latest commit.

Changes in this patch include:

1. page headers on SLRU pages
2. pg_upgrade support to add page headers to existing on-disk SLRU pages
3. a new ReadBuffer mode RBM_TRIM for TrimCLOG and TrimMultiXact
4. Removing concept of External LSNs introduced in Heikki’s patch, as page headers can now store LSNs normally for SLRUs.¬¬¬¬¬
5. Addressed Serial SLRU asserts in previous patch

We still need to fix asserts triggered from memory allocation in the critical section in Munro’s patch in RecordNewMultiXact.

Currently, in GetNewMultiXact we enter the critical section, and end only after we finish our write, after doing RecordNewMultiXact in MultiXactIdCreateFromMembers. Now that we’re making ReadBuffer calls in RecordNewMultiXact, we allocate memory in the critical section, but this isn’t allowed.

For now, to avoid triggering asserts, I moved the end of the critical section before doing ReadBuffer calls, but this could cause potential data corruption for multixacts in the case ReadBuffer fails.

A potential fix for this issue is to hold on to MultiXactGenLock until we successfully read and write to the buffer, to ensure no but this would cause ReadBuffer to become a bottleneck as no other backends could access the MultiXact state data.

We should figure out a way to allow ReadBuffer calls in critical sections specifically for multixacts, as the current behavior is to panic when multixact data write operations fail.

I would appreciate your thoughts on how we could proceed here.

P.S, Ian, thanks for reminding me to rebase the patch!

Sincerely,

Rishu Bagga,

Amazon Web Services (AWS)

Attachments:

slru_to_buffer_cache_with_page_headers_v2.patchapplication/octet-stream; name=slru_to_buffer_cache_with_page_headers_v2.patchDownload+2200-3348
#16Bagga, Rishu
bagrishu@amazon.com
In reply to: Bagga, Rishu (#15)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

Rebased and updated a new patch addressing the critical section issue in
RecordNewMultliXact.In GetNewMultiXactId, we now make our ReadBuffer
calls before starting the critical section, but while holding the
MultiXactGenLock, so we always fetch the correct buffers. We store them
in an array that is accessed later in RecordNewMultiXact.
This way we can keep the existing functionality of only holding the MultiXactGenLock while reading in buffers, but can let go when we are writing,
to preserve the existing concurrency paradigm.

Let me know your thoughts on this approach.

Sincerely,

Rishu Bagga, Amazon Web Services (AWS)

Attachments:

slru_to_buffer_cache_with_page_headers_v3.patchapplication/octet-stream; name=slru_to_buffer_cache_with_page_headers_v3.patchDownload+2175-3260
#17vignesh C
vignesh21@gmail.com
In reply to: Bagga, Rishu (#16)
Re: SLRUs in the main buffer pool - Page Header definitions

On Fri, 16 Dec 2022 at 04:47, Bagga, Rishu <bagrishu@amazon.com> wrote:

Rebased and updated a new patch addressing the critical section issue in
RecordNewMultliXact.In GetNewMultiXactId, we now make our ReadBuffer
calls before starting the critical section, but while holding the
MultiXactGenLock, so we always fetch the correct buffers. We store them
in an array that is accessed later in RecordNewMultiXact.
This way we can keep the existing functionality of only holding the MultiXactGenLock while reading in buffers, but can let go when we are writing,
to preserve the existing concurrency paradigm.
Let me know your thoughts on this approach.

The patch does not apply on top of HEAD as in [1]http://cfbot.cputube.org/patch_41_3514.log, please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
92957ed98c5c565362ce665266132a7f08f6b0c0 ===
=== applying patch ./slru_to_buffer_cache_with_page_headers_v3.patch
...
patching file src/include/catalog/catversion.h
Hunk #1 FAILED at 57.
1 out of 1 hunk FAILED -- saving rejects to file
src/include/catalog/catversion.h.rej

[1]: http://cfbot.cputube.org/patch_41_3514.log

Regards,
Vignesh

#18Bagga, Rishu
bagrishu@amazon.com
In reply to: vignesh C (#17)
Re: SLRUs in the main buffer pool - Page Header definitions

On 1/3/23, 5:06 AM, "vignesh C" <vignesh21@gmail.com> wrote:

On Fri, 16 Dec 2022 at 04:47, Bagga, Rishu <bagrishu@amazon.com>
wrote:
Rebased and updated a new patch addressing the critical section issue in
RecordNewMultliXact.In GetNewMultiXactId, we now make our ReadBuffer
calls before starting the critical section, but while holding the
MultiXactGenLock, so we always fetch the correct buffers. We store
them in an array that is accessed later in RecordNewMultiXact.
This way we can keep the existing functionality of only holding the
MultiXactGenLock while reading in buffers, but can let go when we are
writing, to preserve the existing concurrency paradigm.
Let me know your thoughts on this approach.

The patch does not apply on top of HEAD as in [1], please post a
rebased patch:
=== Applying patches on top of PostgreSQL commit ID
92957ed98c5c565362ce665266132a7f08f6b0c0 ===
=== applying patch ./slru_to_buffer_cache_with_page_headers_v3.patch

...

patching file src/include/catalog/catversion.h
Hunk #1 FAILED at 57.
1 out of 1 hunk FAILED -- saving rejects to file
src/include/catalog/catversion.h.rej

[1] - http://cfbot.cputube.org/patch _41_3514.log

Regards,
Vignesh

Hi all,

Rebased the patch, and fixed a bug I introduced in the previous patch in
TrimCLOG.

We ran a quick set of pgbench tests and observed no regressions. Here
are the numbers:

3 trials, with scale 10,000, 350 clients, 350 threads, over 30 minutes:

Median TPS:

Control

Trial 1: 58331.0
Trial 2: 57191.0
Trial 3: 57101.3

Average of Medians: 57541.1

SLRUs to BufferCache + Page Headers:

Trial 1: 62605.0
Trial 2: 62891.2
Trial 3: 59906.3

Average of Medians: 61800.8

Machine Specs:

Driver

Instance: m5d.metal
Architecture x86_64
CPUs: 96
RAM: 384 GiB
OS: Amazon Linux 2

Server

Instance: r5dn.metal
Architecture x86_64
CPUs: 64
RAM: 500GiB
OS: Amazon Linux 2

Looking forward to your feedback on this.

Sincerely,
Rishu Bagga, Amazon Web Services (AWS)



Attachments:

slru_to_buffer_cache_with_page_headers_v4.patchapplication/octet-stream; name=slru_to_buffer_cache_with_page_headers_v4.patchDownload+2173-3260
#19Bagga, Rishu
bagrishu@amazon.com
In reply to: Bagga, Rishu (#18)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi all,

Rebased the patch, and fixed a bug I introduced in the previous patch
in
TrimCLOG.

Looking forward to your feedback on this.

Hi all,

Rebased patch as per latest community changes since last email.

Sincerely,

Rishu Bagga, Amazon Web Services (AWS)

Attachments:

slru_to_buffer_cache_with_page_headers_v5.patchapplication/octet-stream; name=slru_to_buffer_cache_with_page_headers_v5.patchDownload+2176-3259
#20Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#19)
Re: SLRUs in the main buffer pool - Page Header definitions

Hi,

From 098f37c0a17fc32a94bff43817414e01fcfb234f Mon Sep 17 00:00:00 2001
From: Rishu Bagga <bagrishu@amazon.com>
Date: Thu, 15 Sep 2022 00:55:25 +0000
Subject: [PATCH] slru to buffercache + page headers + upgrade

---
contrib/amcheck/verify_nbtree.c | 2 +-
[...]
65 files changed, 2176 insertions(+), 3258 deletions(-)

Unfortunately a quite large patch, with this little explanation, is hard to
review. I could read through the entire thread to try to figure out what this
is doing, but I really shouldn't have to.

You're changing quite fundamental APIs across the tree. Why is that required
for the topic at hand? Why is it worth doing that, given it'll break loads of
extensions?

Greetings,

Andres Freund

#21Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#19)
#22Bagga, Rishu
bagrishu@amazon.com
In reply to: Andres Freund (#21)
#23Andres Freund
andres@anarazel.de
In reply to: Bagga, Rishu (#22)
#24Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andres Freund (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#24)
#26Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#25)
#27Bagga, Rishu
bagrishu@amazon.com
In reply to: Stephen Frost (#26)
#28Stephen Frost
sfrost@snowman.net
In reply to: Bagga, Rishu (#27)
#29Bagga, Rishu
bagrishu@amazon.com
In reply to: Stephen Frost (#28)
#30Nathan Bossart
nathandbossart@gmail.com
In reply to: Bagga, Rishu (#29)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Nathan Bossart (#30)
#32Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#31)
#33Aleksander Alekseev
aleksander@timescale.com
In reply to: Stephen Frost (#32)
#34Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#32)
#35Bagga, Rishu
bagrishu@amazon.com
In reply to: Aleksander Alekseev (#33)
#36Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#34)
#37Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#36)