pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.

Started by Robert Haasover 3 years ago38 messages
#1Robert Haas
Robert Haas
rhaas@postgresql.org

Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.

If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: /messages/by-id/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/bbace5697df12398e87ffd9879171c39d27f5b33

Modified Files
--------------
src/backend/access/transam/multixact.c | 6 +++---
src/backend/access/transam/twophase.c | 12 ++++++-----
src/backend/access/transam/xact.c | 5 +++--
src/backend/access/transam/xlog.c | 16 ++++++++++++--
src/backend/access/transam/xloginsert.c | 2 +-
src/backend/catalog/storage.c | 29 +++++++++++++++++++++++++-
src/backend/storage/buffer/bufmgr.c | 6 ++++--
src/backend/storage/ipc/procarray.c | 26 ++++++++++++++++-------
src/backend/storage/lmgr/proc.c | 4 ++--
src/include/storage/proc.h | 37 ++++++++++++++++++++++++++++++++-
src/include/storage/procarray.h | 5 +++--
11 files changed, 120 insertions(+), 28 deletions(-)

#2Markus Wanner
Markus Wanner
markus.wanner@enterprisedb.com
In reply to: Robert Haas (#1)
API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On 24.03.22 20:32, Robert Haas wrote:

Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.

This patch changed the delayChkpt field of struct PGPROC from bool to
int. Back-porting this change could be considered an API breaking
change for extensions using this field.

I'm not certain about padding behavior of compilers in general (or
standards requirements around that), but at least on my machine, it
seems sizeof(PGPROC) did not change, so padding led to subsequent fields
still having the same offset.

Nonetheless, the meaning of the field itself changed. And the
additional assert now also triggers for the following pseudo-code of the
extension I'm concerned about:

/*
* Prevent checkpoints being emitted in between additional
* information in the logical message and the following
* prepare record.
*/
MyProc->delayChkpt = true;

LogLogicalMessage(...);

/* Note that this will also reset the delayChkpt flag. */
PrepareTransaction(...);

Now, I'm well aware this is not an official API, it just happens to be
accessible for extensions. So I guess the underlying question is: What
can extension developers expect? Which parts are okay to change even in
stable branches and which can be relied upon to remain stable?

And for this specific case: Is it worth reverting this change and
applying a fully backwards compatible fix, instead?

Regards

Markus Wanner

#3Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Markus Wanner (#2)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 5, 2022 at 9:02 AM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:

And for this specific case: Is it worth reverting this change and
applying a fully backwards compatible fix, instead?

I think it's normally our policy to avoid changing definitions of
accessible structs in back branches, except that we allow ourselves
the indulgence of adding new members at the end or in padding space.
So what would probably be best is if, in the back-branches, we changed
"delayChkpt" back to a boolean, renamed it to delayChkptStart, and
added a separate Boolean called delayChkptEnd. Maybe that could be
added just after statusFlags, where I think it would fall into padding
space.

I think as the person who committed that patch I'm on the hook to fix
this if nobody else would like to do it, but let me ask whether
Kyotaro Horiguchi would like to propose a patch, since the original
patch did, and/or whether you would like to propose a patch, as the
person reporting the issue.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 5, 2022 at 9:02 AM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:

And for this specific case: Is it worth reverting this change and
applying a fully backwards compatible fix, instead?

I think it's normally our policy to avoid changing definitions of
accessible structs in back branches, except that we allow ourselves
the indulgence of adding new members at the end or in padding space.
So what would probably be best is if, in the back-branches, we changed
"delayChkpt" back to a boolean, renamed it to delayChkptStart, and
added a separate Boolean called delayChkptEnd. Maybe that could be
added just after statusFlags, where I think it would fall into padding
space.

Renaming it would constitute an API break, which is if anything worse
than an ABI break.

While we're complaining at you, let me point out that changing a field's
content and semantics while not changing its name is a time bomb waiting
to break any third-party code that looks at (or modifies...) the field.

What I think you need to do is:

1. In the back branches, revert delayChkpt to its previous type and
semantics. Squeeze a separate delayChkptEnd bool in somewhere
(you can't change the struct size either ...).

2. In HEAD, rename the field to something like delayChkptFlags,
to ensure that any code touching it has to be inspected and updated.

In other words, this is already an API break in HEAD, and that's
fine, but it didn't break it hard enough to draw anyone's attention,
which is not fine.

regards, tom lane

#5Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Renaming it would constitute an API break, which is if anything worse
than an ABI break.

I don't think so, because an API break will cause a compilation
failure, which an extension author can easily fix.

While we're complaining at you, let me point out that changing a field's
content and semantics while not changing its name is a time bomb waiting
to break any third-party code that looks at (or modifies...) the field.

What I think you need to do is:

1. In the back branches, revert delayChkpt to its previous type and
semantics. Squeeze a separate delayChkptEnd bool in somewhere
(you can't change the struct size either ...).

2. In HEAD, rename the field to something like delayChkptFlags,
to ensure that any code touching it has to be inspected and updated.

Well, we can do it that way, I suppose.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#5)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Renaming it would constitute an API break, which is if anything worse
than an ABI break.

I don't think so, because an API break will cause a compilation
failure, which an extension author can easily fix.

My point is that we want that to happen in HEAD, but it's not okay
for it to happen in a minor release of a stable branch.

regards, tom lane

#7Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 5, 2022 at 10:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Renaming it would constitute an API break, which is if anything worse
than an ABI break.

I don't think so, because an API break will cause a compilation
failure, which an extension author can easily fix.

My point is that we want that to happen in HEAD, but it's not okay
for it to happen in a minor release of a stable branch.

I understand, but I am not sure that I agree. I think that if an
extension stops compiling against a back-branch, someone will notice
the next time they try to compile it and will fix it. Maybe that's not
amazing, but I don't think it's a huge deal either. On the other hand,
if existing builds that someone has already shipped stop working with
a new server release, that's a much larger issue. The extension
packager can't go back and retroactively add a dependency on the
server version to the already-shipped package. A new package can be
shipped and specify a minimum minor version with which it will work,
but an old package is what it is.

--
Robert Haas
EDB: http://www.enterprisedb.com

#8Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 5, 2022 at 10:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

My point is that we want that to happen in HEAD, but it's not okay
for it to happen in a minor release of a stable branch.

I understand, but I am not sure that I agree. I think that if an
extension stops compiling against a back-branch, someone will notice
the next time they try to compile it and will fix it. Maybe that's not
amazing, but I don't think it's a huge deal either.

Well, perhaps it's not the end of the world, but it's still a large
PITA for the maintainer of such an extension. They can't "just fix it"
because some percentage of their userbase will still need to compile
against older minor releases. Nor have you provided any way to handle
that requirement via conditional compilation.

regards, tom lane

#9Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#1)
Re: API stability

At Tue, 5 Apr 2022 10:01:56 -0400, Robert Haas <robertmhaas@gmail.com> wrote in

I think as the person who committed that patch I'm on the hook to fix
this if nobody else would like to do it, but let me ask whether
Kyotaro Horiguchi would like to propose a patch, since the original
patch did, and/or whether you would like to propose a patch, as the
person reporting the issue.

I'd like to do that. Let me see.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#9)
Re: API stability

me> I'd like to do that. Let me see.

At Tue, 5 Apr 2022 10:29:03 -0400, Robert Haas <robertmhaas@gmail.com> wrote in

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I think you need to do is:

1. In the back branches, revert delayChkpt to its previous type and
semantics. Squeeze a separate delayChkptEnd bool in somewhere
(you can't change the struct size either ...).

2. In HEAD, rename the field to something like delayChkptFlags,
to ensure that any code touching it has to be inspected and updated.

Well, we can do it that way, I suppose.

The change is easy on head, but is it better use uint8 instead of int
for delayChkptFlags?

In the back branches, we have, on gcc/Linux/x86-64,
14's PGPROC is 880 bytes and has gaps:

- 6 bytes after statusFlag
- 4 bytes after syncRepState
- 2 bytes after subxidStatus
- 3 bytes after procArrayGroupMember
- 3 bytes after clogGroupMember
- 3 bytes after fpVXIDLock

It seems that we can place the new variable in the first place above,
since the two are not bonded together, or at least in less tightly
bonded than other candidates.

13's PGPROC is 856 bytes and has a 7 bytes gap after delayChkpt.

Versions Ealier than 13 have delayChkpt in PGXACT (12 bytes). It is
tightly packed and dones't have a room for a new member. Can we add
the new flag to PGPROC instead of PGXACT?

12 and 11's PGPROC is 848 bytes and has gaps:
- 4 bytes after syncRepState
- 3 bytes after procArrayGroupMember
- 3 bytes after clogGroupMember
- 4 bytes after clogGroupMemberPage
- 3 bytes after fpVXIDLock

10's PGPROC is 816 bytes and has gaps:
- 4 bytes after cvWaitLink
- 4 bytes after syncRepState
- 3 bytes after procArrayGroupMember
- 3 bytes after fpVXIDLock

So if we don't want to move any member in PGPROC, we do:

14: after statusFlags.
13: after delayChkpt.
12-10: after syncRepState (and before syncRepLinks).

If we allow to shift some members, the new flag can be placed more
saner place.

14: after delayChkpt ((uint8)statusFlags moves forward by 1 byte)
13: after delayChkpt (no member moves)
12-10: after subxids ((bool)procArrayGroupMember moves forward by 1 byte)

I continue working on the last direction above.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#11Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#10)
Re: API stability

At Wed, 06 Apr 2022 14:30:37 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So if we don't want to move any member in PGPROC, we do:

14: after statusFlags.
13: after delayChkpt.
12-10: after syncRepState (and before syncRepLinks).

If we allow to shift some members, the new flag can be placed more
saner place.

14: after delayChkpt ((uint8)statusFlags moves forward by 1 byte)
13: after delayChkpt (no member moves)
12-10: after subxids ((bool)procArrayGroupMember moves forward by 1 byte)

I continue working on the last direction above.

Hmm. That is ABI break. I go with the first way.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#12Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#11)
Re: API stability

At Wed, 06 Apr 2022 15:31:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Wed, 06 Apr 2022 14:30:37 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So if we don't want to move any member in PGPROC, we do:

14: after statusFlags.
13: after delayChkpt.
12-10: after syncRepState (and before syncRepLinks).

If we allow to shift some members, the new flag can be placed more
saner place.

14: after delayChkpt ((uint8)statusFlags moves forward by 1 byte)
13: after delayChkpt (no member moves)
12-10: after subxids ((bool)procArrayGroupMember moves forward by 1 byte)

I continue working on the last direction above.

Hmm. That is ABI break. I go with the first way.

By the way, the patch for -14 changed the sigunature of two public
functions.

-GetVirtualXIDsDelayingChkpt(int *nvxids)
+GetVirtualXIDsDelayingChkpt(int *nvxids, int type)
-HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids)
+HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)

Do I need to restore the signature?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#12)
2 attachment(s)
Re: API stability

At Wed, 06 Apr 2022 15:53:32 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Wed, 06 Apr 2022 15:31:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Wed, 06 Apr 2022 14:30:37 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So if we don't want to move any member in PGPROC, we do:

14: after statusFlags.
13: after delayChkpt.
12-10: after syncRepState (and before syncRepLinks).

If we allow to shift some members, the new flag can be placed more
saner place.

14: after delayChkpt ((uint8)statusFlags moves forward by 1 byte)
13: after delayChkpt (no member moves)
12-10: after subxids ((bool)procArrayGroupMember moves forward by 1 byte)

I continue working on the last direction above.

Hmm. That is ABI break. I go with the first way.

By the way, the patch for -14 changed the sigunature of two public
functions.

-GetVirtualXIDsDelayingChkpt(int *nvxids)
+GetVirtualXIDsDelayingChkpt(int *nvxids, int type)
-HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids)
+HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)

Do I need to restore the signature?

For master, renamed delayChkpt to delayChkptFlags and changed it to uint8.

For 14, restored delayChkpt to bool and added delayChkptEnd into a gap in PGPROC, then restored the signature of the two functions above to before the patch. Then added a new functions ..DelayingChkptEnd().

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v1-0001-Rename-and-change-type-of-delayChkpt.patchtext/x-patch; charset=us-ascii
v1-0001-Fix-ABI-API-break_v14.patchtext/x-patch; charset=us-ascii
#14Alvaro Herrera
Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Kyotaro Horiguchi (#13)
Re: API stability

On 2022-Apr-06, Kyotaro Horiguchi wrote:

For master, renamed delayChkpt to delayChkptFlags and changed it to
uint8.

For code documentation purposes, I think it is slightly better to use
bits8 than uint8 for variables where you're storing independent bit flags.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)

#15Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Alvaro Herrera (#14)
Re: API stability

At Wed, 6 Apr 2022 10:30:32 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2022-Apr-06, Kyotaro Horiguchi wrote:

For master, renamed delayChkpt to delayChkptFlags and changed it to
uint8.

For code documentation purposes, I think it is slightly better to use
bits8 than uint8 for variables where you're storing independent bit flags.

Oh, agreed. Will fix in the next version along with other fixes.

Thanks!

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#15)
Re: API stability

At Wed, 06 Apr 2022 18:13:17 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Wed, 6 Apr 2022 10:30:32 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

For code documentation purposes, I think it is slightly better to use
bits8 than uint8 for variables where you're storing independent bit flags.

Oh, agreed. Will fix in the next version along with other fixes.

The immediately folloing member statusFlags is in uint8. So using
bits8 here results in the following look.

bits8 delayChkptFlags;/* for DELAY_CHKPT_* flags */

uint8 statusFlags; /* this backend's status flags, see PROC_*
* above. mirrored in

PGPROC has another member that fits bits*.

uint64 fpLockBits; /* lock modes held for each fast-path slot */

Do I change this in this patch? Or leave them for another chance?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#8)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 05, 2022 at 03:16:20PM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 5, 2022 at 10:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

My point is that we want that to happen in HEAD, but it's not okay
for it to happen in a minor release of a stable branch.

I understand, but I am not sure that I agree. I think that if an
extension stops compiling against a back-branch, someone will notice
the next time they try to compile it and will fix it. Maybe that's not
amazing, but I don't think it's a huge deal either.

I agree with Tom's argument. The internals of this structure should
not have changed in a stable branch.

Well, perhaps it's not the end of the world, but it's still a large
PITA for the maintainer of such an extension. They can't "just fix it"
because some percentage of their userbase will still need to compile
against older minor releases. Nor have you provided any way to handle
that requirement via conditional compilation.

For example, I recall that some external extensions make use of
sizeof(PGPROC) for their own business. Isn't 412ad7a going to be a
problem to change this structure's internals for already-compiled code
on stable branches?
--
Michael

#18Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#17)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Thu, Apr 7, 2022 at 2:28 AM Michael Paquier <michael@paquier.xyz> wrote:

Well, perhaps it's not the end of the world, but it's still a large
PITA for the maintainer of such an extension. They can't "just fix it"
because some percentage of their userbase will still need to compile
against older minor releases. Nor have you provided any way to handle
that requirement via conditional compilation.

For example, I recall that some external extensions make use of
sizeof(PGPROC) for their own business. Isn't 412ad7a going to be a
problem to change this structure's internals for already-compiled code
on stable branches?

I don't think that commit changed sizeof(PGPROC), but it did affect
the position of the delayChkpt and statusFlags members within the
struct, which is what we now need to fix. Since I don't hear anyone
else volunteering to take care of that, I'll go work on it.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
2 attachment(s)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I think you need to do is:

1. In the back branches, revert delayChkpt to its previous type and
semantics. Squeeze a separate delayChkptEnd bool in somewhere
(you can't change the struct size either ...).

2. In HEAD, rename the field to something like delayChkptFlags,
to ensure that any code touching it has to be inspected and updated.

Here are patches for master and v14 to do things this way. Comments?

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

0001-Rename-delayChkpt-to-delayChkptFlags.patchapplication/octet-stream; name=0001-Rename-delayChkpt-to-delayChkptFlags.patch
0001-Rethink-the-delay-checkpoint-end-mechanism-in-the-ba.patchapplication/octet-stream; name=0001-Rethink-the-delay-checkpoint-end-mechanism-in-the-ba.patch
#20Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

Robert Haas <robertmhaas@gmail.com> writes:

Here are patches for master and v14 to do things this way. Comments?

WFM.

regards, tom lane

#21Petr Jelinek
Petr Jelinek
petr.jelinek@enterprisedb.com
In reply to: Robert Haas (#19)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On 7. 4. 2022, at 17:19, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 5, 2022 at 10:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I think you need to do is:

1. In the back branches, revert delayChkpt to its previous type and
semantics. Squeeze a separate delayChkptEnd bool in somewhere
(you can't change the struct size either ...).

2. In HEAD, rename the field to something like delayChkptFlags,
to ensure that any code touching it has to be inspected and updated.

Here are patches for master and v14 to do things this way. Comments?

Yeah I think this should do it (compilers should warn on master even without the rename, but who notices that right? :) )

Thanks,
Petr

#22Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#19)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Thu, Apr 07, 2022 at 11:19:15AM -0400, Robert Haas wrote:

Here are patches for master and v14 to do things this way. Comments?

Thanks for the patches. They look correct. For ~14, I'd rather avoid
the code duplication done by GetVirtualXIDsDelayingChkptEnd() and
HaveVirtualXIDsDelayingChkpt() that could be avoided with an extra
bool argument to the existing routine. The same kind of duplication
happens with GetVirtualXIDsDelayingChkpt() and
GetVirtualXIDsDelayingChkptEnd().
--
Michael

#23Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#22)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

Michael Paquier <michael@paquier.xyz> writes:

On Thu, Apr 07, 2022 at 11:19:15AM -0400, Robert Haas wrote:

Here are patches for master and v14 to do things this way. Comments?

Thanks for the patches. They look correct. For ~14, I'd rather avoid
the code duplication done by GetVirtualXIDsDelayingChkptEnd() and
HaveVirtualXIDsDelayingChkpt() that could be avoided with an extra
bool argument to the existing routine.

Isn't adding another argument an API break? (If there's any outside
code calling GetVirtualXIDsDelayingChkpt, which it seems like there
might be.)

regards, tom lane

#24Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#22)
Re: API stability

At Fri, 8 Apr 2022 08:47:42 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Thu, Apr 07, 2022 at 11:19:15AM -0400, Robert Haas wrote:

Here are patches for master and v14 to do things this way. Comments?

Thanks for the patches. They look correct. For ~14, I'd rather avoid
the code duplication done by GetVirtualXIDsDelayingChkptEnd() and
HaveVirtualXIDsDelayingChkpt() that could be avoided with an extra
bool argument to the existing routine. The same kind of duplication
happens with GetVirtualXIDsDelayingChkpt() and
GetVirtualXIDsDelayingChkptEnd().

FWIW, it is done in [1]/messages/by-id/20220406.164521.17171257901083417.horikyota.ntt@gmail.com.

[1]: /messages/by-id/20220406.164521.17171257901083417.horikyota.ntt@gmail.com

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#25Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#24)
Re: API stability

(Mmm. My mailer automatically teared off the [was: ..] part from the
subject..)

--
Kyotaro Horiguchi
NTT Open Source Software Center

#26Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#23)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Thu, Apr 7, 2022 at 7:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael@paquier.xyz> writes:

On Thu, Apr 07, 2022 at 11:19:15AM -0400, Robert Haas wrote:

Here are patches for master and v14 to do things this way. Comments?

Thanks for the patches. They look correct. For ~14, I'd rather avoid
the code duplication done by GetVirtualXIDsDelayingChkptEnd() and
HaveVirtualXIDsDelayingChkpt() that could be avoided with an extra
bool argument to the existing routine.

Isn't adding another argument an API break? (If there's any outside
code calling GetVirtualXIDsDelayingChkpt, which it seems like there
might be.)

Yeah, that's exactly why I didn't do what Michael proposes. If we're
going to go to this trouble to avoid changing the layout of a PGPROC,
we must be doing that on the theory that extension code cares about
delayChkpt. And if that is so, it seems reasonable to suppose that it
might also want to call the associated functions.

Honestly, I wouldn't have thought that this mattered, because I
wouldn't have guessed that any non-core code cared about delayChkpt.
But I would have been wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com

#27Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#26)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Thu, Apr 07, 2022 at 10:19:35PM -0400, Robert Haas wrote:

Yeah, that's exactly why I didn't do what Michael proposes. If we're
going to go to this trouble to avoid changing the layout of a PGPROC,
we must be doing that on the theory that extension code cares about
delayChkpt. And if that is so, it seems reasonable to suppose that it
might also want to call the associated functions.

Compatibility does not strike me as a problem with two static inline
functions used as wrappers of their common logic.

Honestly, I wouldn't have thought that this mattered, because I
wouldn't have guessed that any non-core code cared about delayChkpt.
But I would have been wrong.

That's a minor point. If you wish to keep this code as you are
proposing, that's fine as well by me.
--
Michael

#28Markus Wanner
Markus Wanner
markus.wanner@enterprisedb.com
In reply to: Michael Paquier (#22)
1 attachment(s)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Fri, 2022-04-08 at 08:47 +0900, Michael Paquier wrote:

On Thu, Apr 07, 2022 at 11:19:15AM -0400, Robert Haas wrote:

Here are patches for master and v14 to do things this way.
Comments?

Thanks for the patches.  They look correct.

+1, looks good to me and addresses my specific original concern.

For ~14, I'd rather avoid
the code duplication done by GetVirtualXIDsDelayingChkptEnd() and
HaveVirtualXIDsDelayingChkpt() that could be avoided with an extra
bool argument to the existing routine.  The same kind of duplication
happens with GetVirtualXIDsDelayingChkpt() and
GetVirtualXIDsDelayingChkptEnd().

I agree with Michael, it would be nice to not duplicate the code, but
use a common underlying method. A modified patch is attached.

Best Regards

Markus

Attachments:

0001-Rethink-the-delay-v2.patchtext/x-patch; charset=UTF-8; name=0001-Rethink-the-delay-v2.patch
#29Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Markus Wanner (#28)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Fri, Apr 8, 2022 at 4:47 AM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:

I agree with Michael, it would be nice to not duplicate the code, but
use a common underlying method. A modified patch is attached.

I don't think this is better, but I don't think it's worth arguing
about, either, so I'll do it this way if nobody objects.

Meanwhile, I've committed the patch for master to master.

--
Robert Haas
EDB: http://www.enterprisedb.com

#30Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#8)
Re: API stability

(a bit off-topic)

I'm not sure where I am..

At Wed, 06 Apr 2022 10:36:30 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
me> > this if nobody else would like to do it, but let me ask whether
me> > Kyotaro Horiguchi would like to propose a patch, since the original
me> > patch did, and/or whether you would like to propose a patch, as the
me> > person reporting the issue.
me>
me> I'd like to do that. Let me see.

At Thu, 7 Apr 2022 10:04:20 -0400, Robert Haas <robertmhaas@gmail.com> wrote in

struct, which is what we now need to fix. Since I don't hear anyone
else volunteering to take care of that, I'll go work on it.

Just confirmation. Is my message above didn't look like declaring that
I'd like to volunteering? If so, please teach me the correct way to
say that, since I don't want to repeat the same mistake. Or are there
some other reasons? (Sorry if this looks like a blame, but I asking
plainly (really:).)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#31Matthias van de Meent
Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Kyotaro Horiguchi (#30)
Re: API stability

On Mon, 11 Apr 2022 at 06:30, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

(a bit off-topic)

I'm not sure where I am..

At Wed, 06 Apr 2022 10:36:30 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
me> > this if nobody else would like to do it, but let me ask whether
me> > Kyotaro Horiguchi would like to propose a patch, since the original
me> > patch did, and/or whether you would like to propose a patch, as the
me> > person reporting the issue.
me>
me> I'd like to do that. Let me see.

At Thu, 7 Apr 2022 10:04:20 -0400, Robert Haas <robertmhaas@gmail.com> wrote in

struct, which is what we now need to fix. Since I don't hear anyone
else volunteering to take care of that, I'll go work on it.

Just confirmation. Is my message above didn't look like declaring that
I'd like to volunteering? If so, please teach me the correct way to
say that, since I don't want to repeat the same mistake. Or are there
some other reasons? (Sorry if this looks like a blame, but I asking
plainly (really:).)

I won't speak for Robert H., but this might be because of gmail not
putting this mail in the right thread: Your mail client dropped the
"[was: pgsql: ...]" tag, which Gmail subsequently displays as a
different thread (that is, in my Gmail UI there's three "Re: API
stability" threads, one of which has the [was: pgsql: ...]-tag, and
two of which seem to be started by you as a reply on the original
thread, but with the [was: pgsql: ...]-tag dropped and thus considered
a new thread).

So, this might be the reason Robert overlooked your declaration to
volunteer: he was looking for volunteers in the thread "Re: API
Stability [was: pgsql: ...]" in the Gmail UI, which didn't show your
messages there because of the different subject line.

Kind regards,

Matthias van de Meent

#32Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Matthias van de Meent (#31)
Re: API stability

On Mon, Apr 11, 2022 at 6:48 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

So, this might be the reason Robert overlooked your declaration to
volunteer: he was looking for volunteers in the thread "Re: API
Stability [was: pgsql: ...]" in the Gmail UI, which didn't show your
messages there because of the different subject line.

Yes, that's what happened.

--
Robert Haas
EDB: http://www.enterprisedb.com

#33Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#29)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Fri, Apr 8, 2022 at 11:50 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 8, 2022 at 4:47 AM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:

I agree with Michael, it would be nice to not duplicate the code, but
use a common underlying method. A modified patch is attached.

I don't think this is better, but I don't think it's worth arguing
about, either, so I'll do it this way if nobody objects.

Meanwhile, I've committed the patch for master to master.

Well, I've just realized that Kyotaro Horiguchi volunteered to fix
this on an email thread I did not see because of the way Gmail breaks
the thread if you change the subject line. And he developed a very
similar patch to what we have here. I'm going to use this one as the
basis for going forward because I've already studied it in detail and
it's less work for me to stick with what I know than to go study
something else. But, he also noticed something which we didn't notice
here, which is that before v13, the commit in question actually
changed the size of PGXACT, which is really quite bad -- it needs to
be 12 bytes for performance reasons. And there's no spare bytes
available, so I think we should follow one of the suggestions that he
had over in that email thread, and put delayChkptEnd in PGPROC even
though delayChkpt is in PGXACT.

Comments?

--
Robert Haas
EDB: http://www.enterprisedb.com

#34Markus Wanner
Markus Wanner
markus.wanner@enterprisedb.com
In reply to: Robert Haas (#33)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Mon, 2022-04-11 at 15:21 -0400, Robert Haas wrote:

... before v13, the commit in question actually
changed the size of PGXACT, which is really quite bad -- it needs to
be 12 bytes for performance reasons. And there's no spare bytes
available, so I think we should follow one of the suggestions that he
had over in that email thread, and put delayChkptEnd in PGPROC even
though delayChkpt is in PGXACT.

This makes sense to me. Kudos to Kyotaro for considering this.

At first read, this sounded like a trade-off between compatibility and
performance for PG 12 and older. But I realize leaving delayChkpt in
PGXACT and adding just delayChkptEnd to PGPROC is compatible and leaves
PGXACT at a size of 12 bytes. So this sounds like a good approach to
me.

Best Regards

Markus

#35Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Matthias van de Meent (#31)
Re: API stability

At Mon, 11 Apr 2022 12:48:25 +0200, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote in

On Mon, 11 Apr 2022 at 06:30, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
I won't speak for Robert H., but this might be because of gmail not
putting this mail in the right thread: Your mail client dropped the
"[was: pgsql: ...]" tag, which Gmail subsequently displays as a
different thread (that is, in my Gmail UI there's three "Re: API
stability" threads, one of which has the [was: pgsql: ...]-tag, and
two of which seem to be started by you as a reply on the original
thread, but with the [was: pgsql: ...]-tag dropped and thus considered
a new thread).

Mmm. d*** gmail.. My main mailer does that defaltly but I think I can
*fix* that behavior.

Thanks!

--
Kyotaro Horiguchi
NTT Open Source Software Center

#36Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Markus Wanner (#34)
5 attachment(s)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

(My mailer has been fixed.)

At Mon, 11 Apr 2022 21:45:59 +0200, Markus Wanner <markus.wanner@enterprisedb.com> wrote in

On Mon, 2022-04-11 at 15:21 -0400, Robert Haas wrote:

... before v13, the commit in question actually
changed the size of PGXACT, which is really quite bad -- it needs to
be 12 bytes for performance reasons. And there's no spare bytes
available, so I think we should follow one of the suggestions that he
had over in that email thread, and put delayChkptEnd in PGPROC even
though delayChkpt is in PGXACT.

This makes sense to me. Kudos to Kyotaro for considering this.

At first read, this sounded like a trade-off between compatibility and
performance for PG 12 and older. But I realize leaving delayChkpt in
PGXACT and adding just delayChkptEnd to PGPROC is compatible and leaves
PGXACT at a size of 12 bytes. So this sounds like a good approach to
me.

Thanks!

So, I created the patches for back-patching from 10 to 14. (With
fixed a silly bug of the v1-pg14 that HaveVirtualXIDsDelayingChkpt and
HaveVirtualXIDsDelayingChkptEnd are inverted..)

They revert delayChkpt-related changes made by the patch and add
delayChkptEnd stuff. I compared among every pair of the patches for
neighbouring versions, to make sure not making the same change in
different way and they have the same set of hunks.

This version takes the way sharing the common static function
(*ChkptGuts) between the functions *Chkpt() and *ChkptEnd().

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-Fix-ABI-API-break_10.patchtext/x-patch; charset=us-ascii
v2-0001-Fix-ABI-API-break_14.patchtext/x-patch; charset=us-ascii
v2-0001-Fix-ABI-API-break_13.patchtext/x-patch; charset=us-ascii
v2-0001-Fix-ABI-API-break_12.patchtext/x-patch; charset=us-ascii
v2-0001-Fix-ABI-API-break_11.patchtext/x-patch; charset=us-ascii
#37Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#36)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

On Tue, Apr 12, 2022 at 6:55 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

(My mailer has been fixed.)

Cool.

So, I created the patches for back-patching from 10 to 14. (With
fixed a silly bug of the v1-pg14 that HaveVirtualXIDsDelayingChkpt and
HaveVirtualXIDsDelayingChkptEnd are inverted..)

I am very sorry not to use these, but as I said in my previous post on
this thread, I prefer to commit what I wrote and Markus revised rather
than these versions. I have done that now.

--
Robert Haas
EDB: http://www.enterprisedb.com

#38Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#37)
Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]

At Thu, 14 Apr 2022 11:13:01 -0400, Robert Haas <robertmhaas@gmail.com> wrote in

On Tue, Apr 12, 2022 at 6:55 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

(My mailer has been fixed.)

Cool.

So, I created the patches for back-patching from 10 to 14. (With
fixed a silly bug of the v1-pg14 that HaveVirtualXIDsDelayingChkpt and
HaveVirtualXIDsDelayingChkptEnd are inverted..)

I am very sorry not to use these, but as I said in my previous post on
this thread, I prefer to commit what I wrote and Markus revised rather
than these versions. I have done that now.

Not at all. It's just an unfortunate crossing.
Thanks for fixing this.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center