WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

Started by Matthias van de Meentalmost 2 years ago7 messageshackers
Jump to latest
#1Matthias van de Meent
boekewurm+postgres@gmail.com

Hi,

My collegue Konstantin Knizhnik pointed out that we fail to mark pages
with a non-standard page layout with page_std=false in
RelationCopyStorageUsingBuffer when we WAL-log them. This causes us to
interpret the registered buffer as a standard buffer, and omit the
hole in the page, which for FSM/VM pages covers the whole page.

The immediate effect of this bug is that replicas and primaries in a
physical replication system won't have the same data in their VM- and
FSM-forks until the first VACUUM on the new database has WAL-logged
these pages again. Whilst not actively harmful for the VM/FSM
subsystems, it's definitely suboptimal.
Secondary unwanted effects are that AMs that use the buffercache- but
which don't use or update the pageheader- also won't see the main data
logged in WAL, thus potentially losing user data in the physical
replication stream or with a system crash. I've not looked for any
such AMs and am unaware of any that would have this issue, but it's
better to fix this.

PFA a patch that fixes this issue, by assuming that all pages in the
source database utilize a non-standard page layout.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

Attachments:

v1-0001-Fix-logging-of-non-standard-pages-in-RelationCopy.patchapplication/octet-stream; name=v1-0001-Fix-logging-of-non-standard-pages-in-RelationCopy.patchDownload+6-3
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#1)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

PFA a patch that fixes this issue, by assuming that all pages in the
source database utilize a non-standard page layout.

Surely that cure is worse than the disease?

regards, tom lane

#3Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Tom Lane (#2)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

On Mon, 13 May 2024 at 16:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

PFA a patch that fixes this issue, by assuming that all pages in the
source database utilize a non-standard page layout.

Surely that cure is worse than the disease?

I don't know where we would get the information whether the selected
relation fork's pages are standard-compliant. We could base it off of
the fork number (that info is available locally) but that doesn't
guarantee much.
For VM and FSM-pages we know they're essentially never
standard-compliant (hence this thread), but for the main fork it is
anyone's guess once the user has installed an additional AM - which we
don't detect nor pass through to the offending
RelationCopyStorageUsingBuffer.

As for "worse", the default template database is still much smaller
than the working set of most databases. This will indeed regress the
workload a bit, but only by the fraction of holes in the page + all
FSM/VM data.
I think the additional WAL volume during CREATE DATABASE is worth it
when the alternative is losing that data with physical
replication/secondary instances. Note that this does not disable page
compression, it just stops the logging of holes in pages; holes which
generally are only a fraction of the whole database.

It's not inconceivable that this will significantly increase WAL
volume, but I think we should go for correctness rather than fastest
copy. If we went with fastest copy, we'd better just skip logging the
FSM and VM forks because we're already ignoring the data of the pages,
so why not ignore the pages themselves, too? I don't think that holds
water when we want to be crash-proof in CREATE DATABASE, with a full
data copy of the template database.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#4Robert Haas
robertmhaas@gmail.com
In reply to: Matthias van de Meent (#3)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

On Mon, May 13, 2024 at 10:53 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

It's not inconceivable that this will significantly increase WAL
volume, but I think we should go for correctness rather than fastest
copy.

I don't think we can afford to just do this blindly for the sake of a
hypothetical non-core AM that uses nonstandard pages. There must be
lots of cases where the holes are large, and where the WAL volume
would be a multiple of what it is currently. That's a *big*
regression.

If we went with fastest copy, we'd better just skip logging the
FSM and VM forks because we're already ignoring the data of the pages,
so why not ignore the pages themselves, too? I don't think that holds
water when we want to be crash-proof in CREATE DATABASE, with a full
data copy of the template database.

This seems like a red herring. Either assuming standard pages is a
good idea or it isn't, and either logging the FSM and VM forks is a
good idea or it isn't, but those are two separate questions.

--
Robert Haas
EDB: http://www.enterprisedb.com

#5Akshat Jaimini
destrex271@gmail.com
In reply to: Robert Haas (#4)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

Hi,

Quick question, are there any more revisions left to be done on this patch from the previous feedback?
Or should I continue with reviewing the current patch?

Regards,
Akshat Jaimini

#6Michael Paquier
michael@paquier.xyz
In reply to: Akshat Jaimini (#5)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

On Sat, Sep 14, 2024 at 06:57:21PM +0000, Akshat Jaimini wrote:

Quick question, are there any more revisions left to be done on this
patch from the previous feedback?

This patch is still listed in the CF app waiting on author with what
looks like Robert and Tom objecting to it, because it makes all
callers of CreateAndCopyRelationData() much more expensive in terms of
WAL generated when copying these files as the holes of the pages would
be included. This would also make the compression of the FPWs more
expensive.

One potential way to move forward would be to spread the knowledge of
page_std when logging a new page higher in the callers, then allow
dbcommands.c to be smarter about that?
--
Michael

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#6)
Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts

On 2024-Dec-10, Michael Paquier wrote:

One potential way to move forward would be to spread the knowledge of
page_std when logging a new page higher in the callers, then allow
dbcommands.c to be smarter about that?

Matthias, is this patch still under consideration?

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
https://twitter.com/gunnarmorling/status/1596080409259003906