Add 64-bit XIDs into PostgreSQL 15

Started by Maxim Orlov, almost 4 years ago, 205 messages
#1Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
2 attachment(s)

Hi, hackers!

For a long time, XID wraparound has been a really big pain for highly loaded
systems. One source of performance degradation is the need to vacuum before
every wraparound.
And there have been several proposals to make XIDs 64-bit, like [1], [2], [3] and
[4], to name a few.

The approach in [2] seems to have stalled in the CF since 2018, but meanwhile it
has been used successfully in our Postgres Pro fork ever since.
We have hundreds of customers using 64-bit XIDs. Dozens of instances are
under loads that would require a wraparound every 1-5 days with 32-bit XIDs.
It really helps customers with a huge transaction load that, in the case
of 32-bit XIDs, could experience wraparounds every day. So I'd like to
propose this modified approach to the CF.

PFA an updated working patch, v6, for the PG15 development cycle.
It is based on version 5 of a patch by Alexander Korotkov [5], with a few
fixes and some refactoring, and has been rebased onto PG15.

Main changes:
- Change TransactionId to 64-bit.
- The on-disk tuple format (HeapTupleHeader) is unchanged: xmin and xmax
remain 32-bit.
-- The 32-bit xid is named ShortTransactionId now.
-- Exception: see the "double xmax" format below.
- The heap page format is changed to contain xid and multixact base values;
a tuple's xmin and xmax are stored as offsets from them.
-- xid_base and multi_base are stored as page special data. PageHeader
remains unmodified.
-- If, after upgrade, a page has no free space for the special data, its tuples
are converted to the "double xmax" format: xmin becomes a virtual
FrozenTransactionId and xmax occupies the whole 64 bits.
A page is converted to the new format when vacuum frees enough space.
- In-memory tuples (HeapTuple) are enriched with copies of xid_base and
multi_base from the page.
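
To illustrate the base + offset scheme, here is a minimal sketch (the type and
field names are illustrative and may be spelled differently in the patch; the
"double xmax" case and multixacts are omitted):

/* Sketch only; assumes PostgreSQL's uint32/uint64 typedefs and
 * FirstNormalTransactionId from access/transam.h. */
typedef uint64 TransactionId;       /* now 64 bits wide */
typedef uint32 ShortTransactionId;  /* what is still stored in tuple headers */

typedef struct HeapPageSpecial      /* kept in the page special area */
{
    TransactionId pd_xid_base;      /* base for xmin/xmax offsets on this page */
    TransactionId pd_multi_base;    /* base for multixact offsets on this page */
} HeapPageSpecial;

/* Reconstruct a full 64-bit xid from the 32-bit offset in a tuple header. */
static inline TransactionId
ShortXidToFullXid(ShortTransactionId xid, TransactionId base)
{
    /* Invalid/bootstrap/frozen xids are absolute, not relative to the base. */
    if (xid < FirstNormalTransactionId)
        return (TransactionId) xid;
    return base + xid;
}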

ToDo:
- replace xid_base/multi_base in HeapTuple with precalculated 64bit
xmin/xmax.
- attempt to split the patch into "in-memory" part (add xmin/xmax to
HeapTuple) and "storage" format change.
- try to implement the storage part as a table access method.

Your opinions are very much welcome!

[1]: /messages/by-id/1611355191319-0.post@n3.nabble.com
[2]: /messages/by-id/DA1E65A4-7C5A-461D-B211-2AD5F9A6F2FD@gmail.com
[3]: /messages/by-id/CAPpHfduQ7KuCHvg3dHx+9Pwp_rNf705bjdRCrR_Cqiv_co4H9A@mail.gmail.com
[4]: /messages/by-id/51957591572599112@myt5-3a82a06244de.qloud-c.yandex.net
[5]: /messages/by-id/CAPpHfdseWf0QLWMAhLgiyP4u+5WUondzdQ_Yd-eeF=Duj=Vq0g@mail.gmail.com

Attachments:

v6-0001-Add-64-bit-GUCs-for-xids.patch (application/octet-stream)
v6-0002-Add-64bit-xids.patch (application/octet-stream)
#2Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Maxim,

I’m glad to see that you’re trying to carry the 64-bit XID work forward. I had not noticed that my earlier patch (also derived from Alexander Korotkov’s patch) was responded to back in September. Perhaps we can merge some of the code cleanup that it contained, such as using XID_FMT everywhere and creating a type for the kind of page number returned by TransactionIdToPage() to make the code cleaner.
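
(For context, XID_FMT is just a printf-style format macro for TransactionId, so
call sites don't hard-code "%u"; with 64-bit XIDs it would presumably expand to
PostgreSQL's portable UINT64_FORMAT. A sketch, not necessarily the exact
spelling used in either patch:

#define XID_FMT UINT64_FORMAT

/* Call sites then stay width-agnostic; the variable is just a placeholder: */
elog(DEBUG1, "oldest running xid is " XID_FMT, oldestRunningXid);
)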

Is your patch functionally the same as the PostgresPro implementation? If so, I think it would be helpful for everyone’s understanding to read the PostgresPro documentation on VACUUM. See in particular section “Forced shrinking pg_clog and pg_multixact”

https://postgrespro.com/docs/enterprise/9.6/routine-vacuuming#vacuum-for-wraparound

best regards,

/Jim

#3Stephen Frost
Stephen Frost
sfrost@snowman.net
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

Greetings,

* Maxim Orlov (orlovmg@gmail.com) wrote:

Long time wraparound was a really big pain for highly loaded systems. One
source of performance degradation is the need to vacuum before every
wraparound.
And there were several proposals to make XIDs 64-bit like [1], [2], [3] and
[4] to name a few.

The approach [2] seems to have stalled on CF since 2018. But meanwhile it
was successfully being used in our Postgres Pro fork all time since then.
We have hundreds of customers using 64-bit XIDs. Dozens of instances are
under load that require wraparound each 1-5 days with 32-bit XIDs.
It really helps the customers with a huge transaction load that in the case
of 32-bit XIDs could experience wraparounds every day. So I'd like to
propose this approach modification to CF.

PFA updated working patch v6 for PG15 development cycle.
It is based on a patch by Alexander Korotkov version 5 [5] with a few
fixes, refactoring and was rebased to PG15.

Just to confirm, as I only did a quick look- if a transaction in such a
high rate system lasts for more than a day (which certainly isn't
completely out of the question, I've had week-long transactions
before..), and something tries to delete a tuple on a page which has tuples
that can't be frozen yet due to the long-running transaction- it's just
going to fail?

Not saying that I've got any idea how to fix that case offhand, and we
don't really support such a thing today as the server would just stop
instead, but if I saw something in the release notes talking about PG
moving to 64bit transaction IDs, I'd probably be pretty surprised to
discover that there's still a 32bit limit that you have to watch out for
or your system will just start failing transactions. Perhaps that's a
worthwhile tradeoff for being able to generally avoid having to vacuum
and deal with transaction wrap-around, but I have to wonder if there
might be a better answer. Of course, also wonder about how we're going
to document and monitor for this potential issue and what kind of
corrective action will be needed (kill transactions older than a certain
number of transactions..?).

Thanks,

Stephen

#4Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Stephen Frost (#3)
Re: Add 64-bit XIDs into PostgreSQL 15

On 1/4/22, 2:35 PM, "Stephen Frost" <sfrost@snowman.net> wrote:

Not saying that I've got any idea how to fix that case offhand, and we
don't really support such a thing today as the server would just stop
instead, ...
Perhaps that's a
worthwhile tradeoff for being able to generally avoid having to vacuum
and deal with transaction wrap-around, but I have to wonder if there
might be a better answer.

For the target use cases that PostgreSQL is designed for, it's a very worthwhile tradeoff in my opinion. Such long-running transactions need to be killed.

Re: -- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.

I'm concerned about the maintainability impact of having 2 new on-disk page formats. It's already complex enough with XIDs and multixact-XIDs.

If the lack of space for the two epochs in the special data area is a problem only in an upgrade scenario, why not resolve the problem before completing the upgrade process like a kind of post-process pg_repack operation that converts all "double xmax" pages to the "double-epoch" page format? i.e. maybe the "double xmax" representation is needed as an intermediate representation during upgrade, but after upgrade completes successfully there are no pages with the "double-xmax" representation. This would eliminate a whole class of coding errors and would make the code dealing with 64-bit XIDs simpler and more maintainable.

/Jim

#5Fujii Masao
Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

On 2021/12/30 21:15, Maxim Orlov wrote:

Hi, hackers!

Long time wraparound was a really big pain for highly loaded systems. One source of performance degradation is the need to vacuum before every wraparound.
And there were several proposals to make XIDs 64-bit like [1], [2], [3] and [4] to name a few.

The approach [2] seems to have stalled on CF since 2018. But meanwhile it was successfully being used in our Postgres Pro fork all time since then. We have hundreds of customers using 64-bit XIDs. Dozens of instances are under load that require wraparound each 1-5 days with 32-bit XIDs.
It really helps the customers with a huge transaction load that in the case of 32-bit XIDs could experience wraparounds every day. So I'd like to propose this approach modification to CF.

PFA updated working patch v6 for PG15 development cycle.
It is based on a patch by Alexander Korotkov version 5 [5] with a few fixes, refactoring and was rebased to PG15.

Thanks a lot! I'm really happy to see this proposal again!!

Is there any documentation or README explaining this whole 64-bit XID mechanism?

Could you tell me what happens if new tuple with XID larger than xid_base + 0xFFFFFFFF is inserted into the page? Such new tuple is not allowed to be inserted into that page? Or xid_base and xids of all existing tuples in the page are increased? Also what happens if one of those xids (of existing tuples) cannot be changed because the tuple still can be seen by very-long-running transaction?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#6Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, 30 Dec 2021 at 13:19, Maxim Orlov <orlovmg@gmail.com> wrote:

Hi, hackers!

Long time wraparound was a really big pain for highly loaded systems. One source of performance degradation is the need to vacuum before every wraparound.
And there were several proposals to make XIDs 64-bit like [1], [2], [3] and [4] to name a few.

Very good to see this revived.

PFA updated working patch v6 for PG15 development cycle.
It is based on a patch by Alexander Korotkov version 5 [5] with a few fixes, refactoring and was rebased to PG15.

Main changes:
- Change TransactionId to 64bit

This sounds like a good route to me.

- Disk tuple format (HeapTupleHeader) is unchanged: xmin and xmax
remains 32bit
-- 32bit xid is named ShortTransactionId now.
-- Exception: see "double xmax" format below.
- Heap page format is changed to contain xid and multixact base value,
tuple's xmin and xmax are offsets from.
-- xid_base and multi_base are stored as a page special data. PageHeader
remains unmodified.
-- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.
- In-memory tuples (HeapTuple) were enriched with copies of xid_base and
multi_base from a page.

I think we need more of an overview of the behavior than is available with this
patch, perhaps in the form of a README, such as
src/backend/access/heap/README.HOT.

Most people's comments are about what opportunities and problems this
causes, and mine are definitely along those lines also, i.e. please explain
the user-visible behavior.
Please explain the various new states that pages can be in and what
the effects are.

My understanding is this would be backwards compatible, so we can
upgrade to it. Please confirm.

Thanks

--
Simon Riggs http://www.EnterpriseDB.com/

#7Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Finnerty, Jim (#4)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Jan 4, 2022 at 10:22:50PM +0000, Finnerty, Jim wrote:

I'm concerned about the maintainability impact of having 2 new
on-disk page formats. It's already complex enough with XIDs and
multixact-XIDs.

If the lack of space for the two epochs in the special data area is
a problem only in an upgrade scenario, why not resolve the problem
before completing the upgrade process like a kind of post-process
pg_repack operation that converts all "double xmax" pages to
the "double-epoch" page format? i.e. maybe the "double xmax"
representation is needed as an intermediate representation during
upgrade, but after upgrade completes successfully there are no pages
with the "double-xmax" representation. This would eliminate a whole
class of coding errors and would make the code dealing with 64-bit
XIDs simpler and more maintainable.

Well, yes, we could do this, and it would avoid the complexity of having
to support two XID representations, but we would need to accept that
fast pg_upgrade would be impossible in such cases, since every page
would need to be checked and potentially updated.

You might try to do this while the server is first started and running
queries, but I think we found out from the online checkpoint patch that
having the server in an intermediate state while running queries is very
complex --- it might be simpler to just accept two XID formats all the
time than enabling the server to run with two formats for a short
period. My big point is that this needs more thought.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#8Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, Dec 30, 2021 at 03:15:16PM +0300, Maxim Orlov wrote:

PFA updated working patch v6 for PG15 development cycle.
It is based on a patch by Alexander Korotkov version 5 [5] with a few fixes,
refactoring and was rebased to PG15.

Main changes:
- Change TransactionId to 64bit
- Disk tuple format (HeapTupleHeader) is unchanged: xmin and xmax
  remains 32bit
-- 32bit xid is named ShortTransactionId now.
-- Exception: see "double xmax" format below.
- Heap page format is changed to contain xid and multixact base value,
  tuple's xmin and xmax are offsets from.
-- xid_base and multi_base are stored as a page special data. PageHeader
   remains unmodified.
-- If after upgrade page has no free space for special data, tuples are
   converted to "double xmax" format: xmin became virtual
   FrozenTransactionId, xmax occupies the whole 64bit.
   Page converted to new format when vacuum frees enough space.

I think it is a great idea to allow the 64-XID to span the 32-bit xmin
and xmax fields when needed. It would be nice if we can get focus on
this feature so we are sure it gets into PG 15. Can we add this patch
incrementally so people can more easily analyze it?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#9Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Finnerty, Jim (#2)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Jan 4, 2022 at 05:49:07PM +0000, Finnerty, Jim wrote:

Hi Maxim,
I’m glad to see that you’re trying to carry the 64-bit XID work forward. I
had not noticed that my earlier patch (also derived from Alexander Kortkov’s
patch) was responded to back in September. Perhaps we can merge some of the
code cleanup that it contained, such as using XID_FMT everywhere and creating a
type for the kind of page returned by TransactionIdToPage() to make the code
cleaner.

Is your patch functionally the same as the PostgresPro implementation? If
so, I think it would be helpful for everyone’s understanding to read the
PostgresPro documentation on VACUUM. See in particular section “Forced
shrinking pg_clog and pg_multixact”

https://postgrespro.com/docs/enterprise/9.6/routine-vacuuming#vacuum-for-wraparound

Good point --- we still need vacuum freeze. It would be good to
understand how much value we get in allowing vacuum freeze to be done
less often --- how big can pg_xact/pg_multixact get before they are
problems?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#10Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Bruce Momjian (#7)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Jan 05, 2022 at 06:51:37PM -0500, Bruce Momjian wrote:

On Tue, Jan 4, 2022 at 10:22:50PM +0000, Finnerty, Jim wrote:

I'm concerned about the maintainability impact of having 2 new
on-disk page formats. It's already complex enough with XIDs and
multixact-XIDs.

If the lack of space for the two epochs in the special data area is
a problem only in an upgrade scenario, why not resolve the problem
before completing the upgrade process like a kind of post-process
pg_repack operation that converts all "double xmax" pages to
the "double-epoch" page format? i.e. maybe the "double xmax"
representation is needed as an intermediate representation during
upgrade, but after upgrade completes successfully there are no pages
with the "double-xmax" representation. This would eliminate a whole
class of coding errors and would make the code dealing with 64-bit
XIDs simpler and more maintainable.

Well, yes, we could do this, and it would avoid the complexity of having
to support two XID representations, but we would need to accept that
fast pg_upgrade would be impossible in such cases, since every page
would need to be checked and potentially updated.

You might try to do this while the server is first started and running
queries, but I think we found out from the online checkpoint patch that

I think you meant the online checksum patch. Which this reminded me of, too.

https://commitfest.postgresql.org/31/2611/

having the server in an intermediate state while running queries is very
complex --- it might be simpler to just accept two XID formats all the
time than enabling the server to run with two formats for a short
period. My big point is that this needs more thought.

--
Justin

#11Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Justin Pryzby (#10)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Jan 5, 2022 at 06:12:26PM -0600, Justin Pryzby wrote:

On Wed, Jan 05, 2022 at 06:51:37PM -0500, Bruce Momjian wrote:

On Tue, Jan 4, 2022 at 10:22:50PM +0000, Finnerty, Jim wrote:

I'm concerned about the maintainability impact of having 2 new
on-disk page formats. It's already complex enough with XIDs and
multixact-XIDs.

If the lack of space for the two epochs in the special data area is
a problem only in an upgrade scenario, why not resolve the problem
before completing the upgrade process like a kind of post-process
pg_repack operation that converts all "double xmax" pages to
the "double-epoch" page format? i.e. maybe the "double xmax"
representation is needed as an intermediate representation during
upgrade, but after upgrade completes successfully there are no pages
with the "double-xmax" representation. This would eliminate a whole
class of coding errors and would make the code dealing with 64-bit
XIDs simpler and more maintainable.

Well, yes, we could do this, and it would avoid the complexity of having
to support two XID representations, but we would need to accept that
fast pg_upgrade would be impossible in such cases, since every page
would need to be checked and potentially updated.

You might try to do this while the server is first started and running
queries, but I think we found out from the online checkpoint patch that

I think you meant the online checksum patch. Which this reminded me of, too.

https://commitfest.postgresql.org/31/2611/

Sorry, yes, I have checkpoint on my mind. ;-)

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#12Alexander Korotkov
Alexander Korotkov
aekorotkov@gmail.com
In reply to: Bruce Momjian (#8)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

On Thu, Jan 6, 2022 at 3:02 AM Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Dec 30, 2021 at 03:15:16PM +0300, Maxim Orlov wrote:

PFA updated working patch v6 for PG15 development cycle.
It is based on a patch by Alexander Korotkov version 5 [5] with a few fixes,
refactoring and was rebased to PG15.

Main changes:
- Change TransactionId to 64bit
- Disk tuple format (HeapTupleHeader) is unchanged: xmin and xmax
remains 32bit
-- 32bit xid is named ShortTransactionId now.
-- Exception: see "double xmax" format below.
- Heap page format is changed to contain xid and multixact base value,
tuple's xmin and xmax are offsets from.
-- xid_base and multi_base are stored as a page special data. PageHeader
remains unmodified.
-- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.

I think it is a great idea to allow the 64-XID to span the 32-bit xmin
and xmax fields when needed. It would be nice if we can get focus on
this feature so we are sure it gets into PG 15. Can we add this patch
incrementally so people can more easily analyze it?

I see at least the following major issues/questions in this patch.
1) Current code relies on the fact that TransactionId can be
atomically read from/written to shared memory. With 32-bit systems
and 64-bit TransactionId, that's not true anymore. Therefore, the
patch has concurrency issues on 32-bit systems. We need to carefully
review these issues and provide a fallback for 32-bit systems. I
suppose nobody is thinking about dropping off 32-bit systems, right?
Also, I wonder how critical for us is the overhead for 32-bit systems.
They are going to become legacy, so overhead isn't so critical, right?
2) With this patch we still need to freeze to cut SLRUs. This is
especially problematic with Multixacts, because systems heavily using
row-level locks can consume an enormous amount of multixacts. That is
especially problematic when we have 2x bigger multixacts. We probably
can provide an alternative implementation for multixact vacuum, which
doesn't require scanning all the heaps. That is a fair amount of
work, though. The clog is much smaller and we can cut it more rarely.
Perhaps, we could tolerate freezing to cut clog, couldn't we?
3) The 2x bigger in-memory representation of TransactionId has overheads.
In particular, it could offset the gains from recent advancements
from Andres Freund. I'm not exactly sure we should/can do something
with this. But I think this should be at least carefully analyzed.
4) SP-GiST index stores TransactionId on pages. Could we tolerate
dropping SP-GiST indexes on a major upgrade? Or do we have to invent
something smarter?
5) The 32-bit limitation within the page, mentioned upthread by Stephen
Frost, should also be carefully considered. Ideally, we should get rid
of it, but I don't have particular ideas in this field for now. At
least, we should make sure we did our best at error reporting and
monitoring capabilities.

I think the realistic goal for PG 15 development cycle would be
agreement on a roadmap for all the items above (and probably some
initial implementations).
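
To illustrate the kind of fallback I have in mind for point 1: shared 64-bit
xid fields could be routed through pg_atomic_uint64, which port/atomics.h
already emulates on platforms lacking native 64-bit atomics. A rough sketch
with illustrative names, not code from the patch:

#include "port/atomics.h"

typedef struct XidShared            /* illustrative; not an existing struct */
{
    pg_atomic_uint64 nextXid;       /* initialized with pg_atomic_init_u64() */
} XidShared;

static inline TransactionId
ReadNextXid(XidShared *shared)
{
    return (TransactionId) pg_atomic_read_u64(&shared->nextXid);
}

static inline void
SetNextXid(XidShared *shared, TransactionId xid)
{
    pg_atomic_write_u64(&shared->nextXid, (uint64) xid);
}

The cost of that emulation on 32-bit platforms is exactly the overhead
question above.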

------
Regards,
Alexander Korotkov

#13Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Maxim Orlov (#1)
2 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, 30 Dec 2021 at 13:19, Maxim Orlov <orlovmg@gmail.com> wrote:

Your opinions are very much welcome!

This is a review of the Int64 options patch,
"v6-0001-Add-64-bit-GUCs-for-xids.patch"

Applies cleanly, with some fuzz, compiles cleanly and passes make check.
Patch eyeballs OK, no obvious defects.
Tested using the attached test, so seems to work correctly.
On review of docs, no additions or changes required.
Perhaps add something to README? If so, minor doc patch attached.

Otherwise, this sub-patch is READY FOR COMMITTER.

--
Simon Riggs http://www.EnterpriseDB.com/

Attachments:

int64opts_test.patch (application/octet-stream)
int64_minor_README.v1.patch (application/octet-stream)
#14Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Finnerty, Jim (#4)
Re: Add 64-bit XIDs into PostgreSQL 15

(Maxim) Re: -- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.

A better way would be to prepare the database for conversion to the 64-bit XID format before the upgrade so that it ensures that every page has enough room for the two new epochs (E bits).

1. Enforce the rule that no INSERT or UPDATE to an existing page will leave less than E bits of free space on a heap page

2. Run an online and restartable task, analogous to pg_repack, that rewrites and splits any page that has less than E bits of free space. This needs to be run on all non-temp tables in all schemas in all databases. DDL operations are not allowed on a target table while this operation runs, which is enforced by taking an ACCESS SHARE lock on each table while the process is running. To mitigate the effects of this restriction, the restartable task can be restricted to run only in certain hours. This could be implemented as a background maintenance task that runs for X hours as of a certain time of day and then kicks itself off again in 24-X hours, logging its progress.

When this task completes, the database is ready for upgrade to 64-bit XIDs, and there is no possibility that any page has insufficient free space for the special data.

Would you agree that this approach would completely eliminate the need for a "double xmax" representation?

/Jim

#15Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Finnerty, Jim (#14)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, 6 Jan 2022 at 13:15, Finnerty, Jim <jfinnert@amazon.com> wrote:

(Maxim) Re: -- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.

A better way would be to prepare the database for conversion to the 64-bit XID format before the upgrade so that it ensures that every page has enough room for the two new epochs (E bits).

Most software has a one-stage upgrade model. What you propose would
have us install 2 things, with a step in-between, which makes it
harder to manage.

1. Enforce the rule that no INSERT or UPDATE to an existing page will leave less than E bits of free space on a heap page

2. Run an online and restartable task, analogous to pg_repack, that rewrites and splits any page that has less than E bits of free space. This needs to be run on all non-temp tables in all schemas in all databases. DDL operations are not allowed on a target table while this operation runs, which is enforced by taking an ACCESS SHARE lock on each table while the process is running. To mitigate the effects of this restriction, the restartable task can be restricted to run only in certain hours. This could be implemented as a background maintenance task that runs for X hours as of a certain time of day and then kicks itself off again in 24-X hours, logging its progress.

When this task completes, the database is ready for upgrade to 64-bit XIDs, and there is no possibility that any page has insufficient free space for the special data.

Would you agree that this approach would completely eliminate the need for a "double xmax" representation?

I agree about the idea for scanning existing data blocks, but why not
do this AFTER upgrade?

1. Upgrade, with important aspect not-enabled-yet, but everything else
working - all required software is delivered in one shot, with fast
upgrade
2. As each table is VACUUMed, we confirm/clean/groom data blocks so
each table is individually confirmed as being ready. The pace that
this happens at is under user control.
3. When all tables have been prepared, then restart to allow xid64 format usage

--
Simon Riggs http://www.EnterpriseDB.com/

#16Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Simon Riggs (#15)
Re: Add 64-bit XIDs into PostgreSQL 15

Re: Most software has a one-stage upgrade model. What you propose would
have us install 2 things, with a step in-between, which makes it
harder to manage.

The intended benefit would be that the code doesn't need to handle the possibility of 2 different XID representations for the indefinite future.

I agree that VACUUM would be the preferred tool to make room for the special data area so that there is no need to install a separate tool, though, whether this work happens before or after the upgrade.

Re: 1. Upgrade, with important aspect not-enabled-yet, but everything else working - all required software is delivered in one shot, with fast upgrade

Let's clarify what happens during upgrade. What format are the pages in immediately after the upgrade?

2. As each table is VACUUMed, we confirm/clean/groom data blocks so
each table is individually confirmed as being ready. The pace that
this happens at is under user control.

What are VACUUM's new responsibilities in this phase? VACUUM needs a new task that confirms, for a given table, that no heap page remains that is not ready.

If upgrade put all the pages into either the double-xmax or the double-epoch representation, then VACUUM's responsibility could be to split the double-xmax pages into the double-epoch representation and verify that no double-xmax pages remain.

3. When all tables have been prepared, then restart to allow xid64 format usage

Let's also clarify what happens at restart time.

If we were to do the upgrade before preparing space in advance, is there a way to ever remove the code that knows about the double-xmax XID representation?

#17Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Fujii Masao (#5)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Jan 4, 2022 at 9:40 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

Could you tell me what happens if new tuple with XID larger than xid_base + 0xFFFFFFFF is inserted into the page? Such new tuple is not allowed to be inserted into that page?

I fear that this patch will have many bugs along these lines. Example:
Why is it okay that convert_page() may have to defragment a heap page,
without holding a cleanup lock? That will subtly break code that holds
a pin on the buffer, when a tuple slot contains a C pointer to a
HeapTuple in shared memory (though only if we get unlucky).

Currently we are very permissive about what XID/backend can (or
cannot) consume a specific piece of free space from a specific heap
page in code like RelationGetBufferForTuple(). It would be very hard
to enforce a policy like "your XID cannot insert onto this particular
heap page" with the current FSM design. (I actually think that we
should have a FSM that supports these requirements, but that's a big
project -- a long term goal of mine [1].)

Or xid_base and xids of all existing tuples in the page are increased? Also what happens if one of those xids (of existing tuples) cannot be changed because the tuple still can be seen by very-long-running transaction?

That doesn't work in the general case, I think. How could it, unless
we truly had 64-bit XIDs in heap tuple headers? You can't necessarily
freeze to fix the problem, because we can only freeze XIDs that 1.)
committed, and 2.) are visible to every possible MVCC snapshot. (I
think you were alluding to this problem yourself.)

I believe that a good solution to the problem that this patch tries to
solve needs to be more ambitious. I think that we need to return to
first principles, rather than extending what we have already.
Currently, we store XIDs in tuple headers so that we can determine the
tuple's visibility status, based on whether the XID committed (or
aborted), and where our snapshot sees the XID as "in the past" (in the
case of an inserted tuple's xmin). You could say that XIDs from tuple
headers exist so we can see differences *between* tuples. But these
differences are typically not useful/interesting for very long. 32-bit
XIDs are sometimes not wide enough, but usually they're "too wide":
Why should we need to consider an old XID (e.g. do clog lookups) at
all, barring extreme cases?

Why do we need to keep any kind of metadata about transactions around
for a long time? Postgres has not supported time travel in 25 years!

If we eagerly cleaned-up aborted transactions with a special kind of
VACUUM (which would remove aborted XIDs), we could also maintain a
structure that indicates if all of the XIDs on a heap page are known
"all committed" implicitly (no dirtying the page, no hint bits, etc)
-- something a little like the visibility map, that is mostly set
implicitly (not during VACUUM). That doesn't fix the wraparound
problem itself, of course. But it enables a design that imposes the
same problem on the specific old snapshot instead -- something like a
"snapshot too old" error is much better than a system-wide wraparound
failure. That approach is definitely very hard, and also requires a
smart FSM along the lines described in [1], but it seems like the best
way forward.

As I pointed out already, freezing is bad because it imposes the
requirement that everybody considers an affected XID committed and
visible, which is brittle (e.g., old snapshots can cause wraparound
failure). More generally, we rely too much on explicitly maintaining
"absolute" metadata inline, when we should implicitly maintain
"relative" metadata (that can be discarded quickly and without concern
for old snapshots). We need to be more disciplined about what XIDs can
modify what heap pages in the first place (in code like hio.c and the
FSM) to make all this work.

[1]: /messages/by-id/CAH2-Wz=zEV4y_wxh-A_EvKxeAoCMdquYMHABEh_kZO1rk3a-gw@mail.gmail.com
--
Peter Geoghegan

#18Alexander Korotkov
Alexander Korotkov
aekorotkov@gmail.com
In reply to: Finnerty, Jim (#14)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, Jan 6, 2022 at 4:15 PM Finnerty, Jim <jfinnert@amazon.com> wrote:

(Maxim) Re: -- If after upgrade page has no free space for special data, tuples are
converted to "double xmax" format: xmin became virtual
FrozenTransactionId, xmax occupies the whole 64bit.
Page converted to new format when vacuum frees enough space.

A better way would be to prepare the database for conversion to the 64-bit XID format before the upgrade so that it ensures that every page has enough room for the two new epochs (E bits).

1. Enforce the rule that no INSERT or UPDATE to an existing page will leave less than E bits of free space on a heap page

2. Run an online and restartable task, analogous to pg_repack, that rewrites and splits any page that has less than E bits of free space. This needs to be run on all non-temp tables in all schemas in all databases. DDL operations are not allowed on a target table while this operation runs, which is enforced by taking an ACCESS SHARE lock on each table while the process is running. To mitigate the effects of this restriction, the restartable task can be restricted to run only in certain hours. This could be implemented as a background maintenance task that runs for X hours as of a certain time of day and then kicks itself off again in 24-X hours, logging its progress.

When this task completes, the database is ready for upgrade to 64-bit XIDs, and there is no possibility that any page has insufficient free space for the special data.

Would you agree that this approach would completely eliminate the need for a "double xmax" representation?

The "prepare" approach was the first tried.
https://github.com/postgrespro/pg_pageprep
But it appears to be very difficult and unreliable. After investing
many months into pg_pageprep, "double xmax" approach appears to be
very fast to implement and reliable.

------
Regards,
Alexander Korotkov

#19Fujii Masao
Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Simon Riggs (#13)
Re: Add 64-bit XIDs into PostgreSQL 15

On 2022/01/06 19:24, Simon Riggs wrote:

On Thu, 30 Dec 2021 at 13:19, Maxim Orlov <orlovmg@gmail.com> wrote:

Your opinions are very much welcome!

This is a review of the Int64 options patch,
"v6-0001-Add-64-bit-GUCs-for-xids.patch"

Do we really need to support both int32 and int64 options? Isn't it enough to replace the existing int32 option with an int64 one? Or how about using a string-type option for very large numbers like 64-bit XIDs, as is done for recovery_target_xid?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#20Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Fujii Masao (#19)
Re: Add 64-bit XIDs into PostgreSQL 15

Re: clog page numbers, as returned by TransactionIdToPage

-	int			pageno = TransactionIdToPage(xid);	/* get page of parent */
+	int64		pageno = TransactionIdToPage(xid);	/* get page of parent */

...

-	int			pageno = TransactionIdToPage(subxids[0]);
+	int64		pageno = TransactionIdToPage(subxids[0]);
 	int			offset = 0;
 	int			i = 0;

...

-		int			nextpageno;
+		int64		nextpageno;

Etc.

In all those places where you are replacing int with int64 for the kind of values returned by TransactionIdToPage(), would you mind replacing the int64's with a type name, such as ClogPageNumber, for improved code maintainability?
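
Concretely, something along these lines (the name is just a suggestion):

/* A dedicated type for page numbers of xid-indexed SLRUs such as clog. */
typedef int64 ClogPageNumber;

/* ...so the call sites above read as: */
ClogPageNumber pageno = TransactionIdToPage(xid);   /* get page of parent */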

#21Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Alexander Korotkov (#18)
Re: Add 64-bit XIDs into PostgreSQL 15

Re: The "prepare" approach was the first tried.
https://github.com/postgrespro/pg_pageprep
But it appears to be very difficult and unreliable. After investing
many months into pg_pageprep, "double xmax" approach appears to be
very fast to implement and reliable.

I'd still like a plan to retire the "double xmax" representation eventually. Previously I suggested that this could be done as a post-process, before upgrade is complete, but that could potentially make upgrade very slow.

Another way to retire the "double xmax" representation eventually could be to disallow "double xmax" pages in subsequent major version upgrades (e.g. to PG16, if "double xmax" pages are introduced in PG15). This gives the luxury of time after a fast upgrade to convert all pages to contain the epochs, while still providing a path to more maintainable code in the future.

#22Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Finnerty, Jim (#21)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, Jan 07, 2022 at 03:53:51PM +0000, Finnerty, Jim wrote:

I'd still like a plan to retire the "double xmax" representation eventually. Previously I suggested that this could be done as a post-process, before upgrade is complete, but that could potentially make upgrade very slow.

Another way to retire the "double xmax" representation eventually could be to disallow "double xmax" pages in subsequent major version upgrades (e.g. to PG16, if "double xmax" pages are introduced in PG15). This gives the luxury of time after a fast upgrade to convert all pages to contain the epochs, while still providing a path to more maintainable code in the future.

Yes, but how are you planning to rewrite it? Is vacuum enough?
I suppose it'd need FREEZE + DISABLE_PAGE_SKIPPING ?

This would preclude upgrading "across" v15. Maybe that'd be okay, but it'd be
a new and atypical restriction.

How would you enforce that it'd been run on v15 before upgrading to pg16 ?

You'd need to track whether vacuum had completed the necessary steps in pg15.
I don't think it'd be okay to make pg_upgrade --check read every tuple.

The "keeping track" part is what reminds me of the online checksum patch.
It'd be ideal if there were a generic solution to this kind of task, or at
least a "model" process to follow.

--
Justin

#23Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Fujii Masao (#19)
Re: Add 64-bit XIDs into PostgreSQL 15

On 07.01.22 06:18, Fujii Masao wrote:

On 2022/01/06 19:24, Simon Riggs wrote:

On Thu, 30 Dec 2021 at 13:19, Maxim Orlov <orlovmg@gmail.com> wrote:

Your opinions are very much welcome!

This is a review of the Int64 options patch,
"v6-0001-Add-64-bit-GUCs-for-xids.patch"

Do we really need to support both int32 and int64 options? Isn't it
enough to replace the existing int32 option with int64 one?

I think that would create a lot of problems. You'd have to change every
underlying int variable to int64, and then check whether that causes any
issues where they are used (wrong printf format, assignments,
overflows), and you'd have to check whether the existing limits are
still appropriate. And extensions would be upset. This would be a big
mess.

Or how about
using string-type option for very large number like 64-bit XID, like
it's done for recovery_target_xid?

Seeing how many variables that contain transaction ID information
actually exist, I think it could be worth introducing a new category as
proposed. Otherwise, you'd have to write a lot of check and assign hooks.

I do wonder whether signed vs. unsigned is handled correctly.
Transaction IDs are unsigned, but all GUC handling is signed.
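
For reference, such a category would presumably mirror struct config_int in
guc_tables.h with the width bumped to int64. A rough sketch only; the *64 hook
typedefs below are hypothetical, not existing symbols:

struct config_int64
{
    struct config_generic gen;
    int64      *variable;
    int64       boot_val;
    int64       min;                /* still signed -- hence the question above */
    int64       max;
    GucInt64CheckHook check_hook;   /* hypothetical hook typedefs */
    GucInt64AssignHook assign_hook;
    GucShowHook show_hook;
    int64       reset_val;
    void       *reset_extra;
};

Note that min/max stay signed here, which is exactly the signed vs. unsigned
question.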

#24Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Finnerty, Jim (#21)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, Jan 7, 2022 at 03:53:51PM +0000, Finnerty, Jim wrote:

Re: The "prepare" approach was the first tried.
https://github.com/postgrespro/pg_pageprep But it appears to be
very difficult and unreliable. After investing many months into
pg_pageprep, "double xmax" approach appears to be very fast to
implement and reliable.

I'd still like a plan to retire the "double xmax" representation
eventually. Previously I suggested that this could be done as a
post-process, before upgrade is complete, but that could potentially
make upgrade very slow.

Another way to retire the "double xmax" representation eventually
could be to disallow "double xmax" pages in subsequent major version
upgrades (e.g. to PG16, if "double xmax" pages are introduced in
PG15). This gives the luxury of time after a fast upgrade to convert
all pages to contain the epochs, while still providing a path to more
maintainable code in the future.

This gets into the entire issue we have discussed in the past but never
resolved --- how do we manage state changes in the Postgres file format
while the server is running? pg_upgrade and pg_checksums avoid the
problem by doing such changes while the server is down, and other file
formats have avoided it by allowing perpetual reading of the old format.

Any such non-perpetual changes while the server is running must deal
with recording the start of the state change, the completion of it,
communicating such state changes to all running backends in a
synchronous manner, and possible restarting of the state change.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#25Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Justin Pryzby (#22)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, 7 Jan 2022 at 16:09, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Fri, Jan 07, 2022 at 03:53:51PM +0000, Finnerty, Jim wrote:

I'd still like a plan to retire the "double xmax" representation eventually. Previously I suggested that this could be done as a post-process, before upgrade is complete, but that could potentially make upgrade very slow.

Another way to retire the "double xmax" representation eventually could be to disallow "double xmax" pages in subsequent major version upgrades (e.g. to PG16, if "double xmax" pages are introduced in PG15). This gives the luxury of time after a fast upgrade to convert all pages to contain the epochs, while still providing a path to more maintainable code in the future.

Yes, but how are you planning to rewrite it? Is vacuum enough?

Probably not, but VACUUM is the place to add such code.

I suppose it'd need FREEZE + DISABLE_PAGE_SKIPPING ?

Yes

This would preclude upgrading "across" v15. Maybe that'd be okay, but it'd be
a new and atypical restriction.

I don't see that restriction. Anyone upgrading from before PG15 would
apply the transform. Just because we introduce a transform in PG15
doesn't mean we can't apply that transform in later releases as well,
to allow say PG14 -> PG16.

--
Simon Riggs http://www.EnterpriseDB.com/

#26Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#17)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, Jan 6, 2022 at 3:45 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jan 4, 2022 at 9:40 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

Could you tell me what happens if new tuple with XID larger than xid_base + 0xFFFFFFFF is inserted into the page? Such new tuple is not allowed to be inserted into that page?

I fear that this patch will have many bugs along these lines. Example:
Why is it okay that convert_page() may have to defragment a heap page,
without holding a cleanup lock? That will subtly break code that holds
a pin on the buffer, when a tuple slot contains a C pointer to a
HeapTuple in shared memory (though only if we get unlucky).

Yeah. I think it's possible that some approach along the lines of what
is proposed here can work, but quality of implementation is a big
issue. This stuff is not easy to get right. Another thing that I'm
wondering about is the "double xmax" representation. That requires
some way of distinguishing when that representation is in use. I'd be
curious to know where we found the bits for that -- the tuple header
isn't exactly replete with extra bit space.

Also, if we have an epoch of some sort that is included in new page
headers but not old ones, that adds branches to code that might
sometimes be quite hot. I don't know how much of a problem that is,
but it seems worth worrying about.

For all of that, I don't particularly agree with Jim Finnerty's idea
that we ought to solve the problem by forcing sufficient space to
exist in the page pre-upgrade. There are some advantages to such
approaches, but they make it really hard to roll out changes. You have
to roll out the enabling change first, wait until everyone is running
a release that supports it, and only then release the technology that
requires the additional page space. Since we don't put new features
into back-branches -- and an on-disk format change would be a poor
place to start -- that would make rolling something like this out take
many years. I think we'll be much happier putting all the complexity
in the new release.

--
Robert Haas
EDB: http://www.enterprisedb.com

#27Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#12)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Jan 5, 2022 at 9:53 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

I see at least the following major issues/questions in this patch.
1) Current code relies on the fact that TransactionId can be
atomically read from/written to shared memory. With 32-bit systems
and 64-bit TransactionId, that's not true anymore. Therefore, the
patch has concurrency issues on 32-bit systems. We need to carefully
review these issues and provide a fallback for 32-bit systems. I
suppose nobody is thinking about dropping off 32-bit systems, right?

I think that's right. Not yet, anyway.

Also, I wonder how critical for us is the overhead for 32-bit systems.
They are going to become legacy, so overhead isn't so critical, right?

Agreed.

2) With this patch we still need to freeze to cut SLRUs. This is
especially problematic with Multixacts, because systems heavily using
row-level locks can consume an enormous amount of multixacts. That is
especially problematic when we have 2x bigger multixacts. We probably
can provide an alternative implementation for multixact vacuum, which
doesn't require scanning all the heaps. That is a pretty amount of
work though. The clog is much smaller and we can cut it more rarely.
Perhaps, we could tolerate freezing to cut clog, couldn't we?

Right. We can't let any of the SLRUs -- don't forget about stuff like
pg_subtrans, which is a multiple of the size of clog -- grow without
bound, even if it never forces a system shutdown. I'm not sure it's a
good idea to think about introducing new freezing mechanisms at the
same time as we're making other changes, though. Just removing the
possibility of a wraparound shutdown without changing any of the rules
about how and when we freeze would be a significant advancement. Other
changes could be left for future work.

3) 2x bigger in-memory representation of TransactionId have overheads.
In particular, it could mitigate the effect of recent advancements
from Andres Freund. I'm not exactly sure we should/can do something
with this. But I think this should be at least carefully analyzed.

Seems fair.

4) SP-GiST index stores TransactionId on pages. Could we tolerate
dropping SP-GiST indexes on a major upgrade? Or do we have to invent
something smarter?

Probably depends on how much work it is. SP-GiST indexes are not
mainstream, so I think we could at least consider breaking
compatibility, but it doesn't seem like a thing to do lightly.

5) 32-bit limitation within the page mentioned upthread by Stephen
Frost should be also carefully considered. Ideally, we should get rid
of it, but I don't have particular ideas in this field for now. At
least, we should make sure we did our best at error reporting and
monitoring capabilities.

I don't think I understand the thinking here. As long as we retain the
existing limit that the oldest running XID can't be more than 2
billion XIDs in the past, we can't ever need to throw an error. A new
page modification that finds very old XIDs on the page can always
escape trouble by pruning the page and freezing whatever old XIDs
survive pruning.

I would argue that it's smarter not to change the in-memory
representation of XIDs to 64-bit in places like the ProcArray. As you
mention in (4), that might hurt performance. But also, the benefit is
minimal. Nobody is really sad that they can't keep transactions open
forever. They are sad because the system has severe bloat and/or shuts
down entirely. Some kind of change along these lines can fix the
second of those problems, and that's progress.

I think the realistic goal for PG 15 development cycle would be
agreement on a roadmap for all the items above (and probably some
initial implementations).

+1. Trying to rush something through to commit is just going to result
in a bunch of bugs. We need to work through the issues carefully and
take the time to do it well.

--
Robert Haas
EDB: http://www.enterprisedb.com

#28Stephen Frost
Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#27)
Re: Add 64-bit XIDs into PostgreSQL 15

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 5, 2022 at 9:53 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

5) 32-bit limitation within the page mentioned upthread by Stephen
Frost should be also carefully considered. Ideally, we should get rid
of it, but I don't have particular ideas in this field for now. At
least, we should make sure we did our best at error reporting and
monitoring capabilities.

I don't think I understand the thinking here. As long as we retain the
existing limit that the oldest running XID can't be more than 2
billion XIDs in the past, we can't ever need to throw an error. A new
page modification that finds very old XIDs on the page can always
escape trouble by pruning the page and freezing whatever old XIDs
survive pruning.

So we'll just fail such an old transaction? Or force a server restart?
or..? What if we try to signal that transaction and it doesn't go away?

I would argue that it's smarter not to change the in-memory
representation of XIDs to 64-bit in places like the ProcArray. As you
mention in (4), that might hurt performance. But also, the benefit is
minimal. Nobody is really sad that they can't keep transactions open
forever. They are sad because the system has severe bloat and/or shuts
down entirely. Some kind of change along these lines can fix the
second of those problems, and that's progress.

I brought up the concern that I did because I would be a bit sad if I
couldn't have a transaction open for a day on a very high rate system of
the type being discussed here. Would be fantastic if we had a solution
to that issue, but I get that reducing the need to vacuum and such would
be a really nice improvement even if we can't make long running
transactions work. Then again, if we do actually change the in-memory
bits- then maybe we could have such a long running transaction, provided
it didn't try to make an update to a page with really old xids on it,
which might be entirely reasonable in a lot of cases. I do still worry
about how we explain what the limitation here is and how to avoid
hitting it. Definitely seems like a 'gotcha' that people are going to
complain about, though hopefully not as much of one as the current cases
we hear about of vacuum falling behind and the system running out of
xids.

I think the realistic goal for PG 15 development cycle would be
agreement on a roadmap for all the items above (and probably some
initial implementations).

+1. Trying to rush something through to commit is just going to result
in a bunch of bugs. We need to work through the issues carefully and
take the time to do it well.

+1.

Thanks,

Stephen

#29Corey Huinker
Corey Huinker
corey.huinker@gmail.com
In reply to: Robert Haas (#26)
Re: Add 64-bit XIDs into PostgreSQL 15

I'd be
curious to know where we found the bits for that -- the tuple header
isn't exactly replete with extra bit space.

+1 - and can we somehow shoehorn a version # into the new format so we
never have to look for spare bits again?

#30Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Finnerty, Jim (#2)
Re: Add 64-bit XIDs into PostgreSQL 15

Perhaps we can merge some of the code cleanup that it contained, such
as using XID_FMT everywhere and creating a type for the kind of page
returned by TransactionIdToPage() to make the code cleaner.

Agree, I think this is a good idea.

Is your patch functionally the same as the PostgresPro
implementation?

Yes, it is. It basically is the Postgres Pro implementation, not just a concept
or something.

--
Best regards,
Maxim Orlov.

#31Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Fujii Masao (#5)
Re: Add 64-bit XIDs into PostgreSQL 15

Is there any documentation or README explaining this whole 64-bit XID
mechanism?

There is none, unfortunately. I will come back to this later.

Could you tell me what happens if new tuple with XID larger than xid_base
+ 0xFFFFFFFF is inserted into the page? Such new tuple is not allowed to be
inserted into that page? Or xid_base and xids of all existing tuples in the
page are increased? Also what happens if one of those xids (of existing
tuples) cannot be changed because the tuple still can be seen by
very-long-running transaction?

All of this mechanism sits around heap_insert/heap_update, which call
heap_page_prepare_for_xid(); if that fails (because a tuple is still
visible), an error is raised. Also, if an xid_base shift is not immediately
viable, it will try to remove old tuples first.
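
Roughly, the flow looks like this (a simplified sketch reusing the illustrative
HeapPageSpecial layout from the first message; the helper below is not the
patch's actual function, and WAL, multixacts and "double xmax" pages are
ignored):

static void
prepare_page_for_xid(Page page, TransactionId xid)
{
    HeapPageSpecial *special = (HeapPageSpecial *) PageGetSpecialPointer(page);
    TransactionId    base = special->pd_xid_base;

    /* Already representable as a 32-bit offset from this page's base? */
    if (xid >= base + FirstNormalTransactionId && xid <= base + 0xFFFFFFFF)
        return;

    /* Otherwise try to prune old tuples and shift the base forward. */
    if (try_prune_and_shift_base(page, xid))    /* illustrative helper */
        return;

    /* Some tuple is still visible to an old snapshot: give up. */
    ereport(ERROR,
            (errmsg("cannot fit new transaction id onto page")));
}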

--
Best regards,
Maxim Orlov.

#32Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Maxim Orlov (#30)
Re: Add 64-bit XIDs into PostgreSQL 15

On Sat, 8 Jan 2022 at 08:21, Maxim Orlov <orlovmg@gmail.com> wrote:

Perhaps we can merge some of the code cleanup that it contained, such as using XID_FMT everywhere and creating a type for the kind of page returned by TransactionIdToPage() to make the code cleaner.

Agree, I think this is a good idea.

Looks to me like the best next actions would be:

1. Submit a patch that uses XID_FMT everywhere, as a cosmetic change.
This looks like it will reduce the main patch size considerably and
make it much less scary. That can be cleaned up and committed while we
discuss the main approach.

2. Write up the approach in a detailed README, so people can
understand the proposal and assess if there are problems. A few short
notes and a link back to old conversations isn't enough to allow wide
review and give confidence on such a major patch.

--
Simon Riggs http://www.EnterpriseDB.com/

#33Finnerty, Jim
Finnerty, Jim
jfinnert@amazon.com
In reply to: Simon Riggs (#32)
Re: Add 64-bit XIDs into PostgreSQL 15

Re: patch that uses XID_FMT everywhere ... to make the main patch much smaller

That's exactly what my previous patch did, plus the patch to support 64-bit GUCs.

Maxim, maybe it's still a good idea to isolate those two patches and submit them separately first, to reduce the size of the rest of the patch?

On 1/12/22, 8:28 AM, "Simon Riggs" <simon.riggs@enterprisedb.com> wrote:

On Sat, 8 Jan 2022 at 08:21, Maxim Orlov <orlovmg@gmail.com> wrote:

Perhaps we can merge some of the code cleanup that it contained, such as using XID_FMT everywhere and creating a type for the kind of page returned by TransactionIdToPage() to make the code cleaner.

Agree, I think this is a good idea.

Looks to me like the best next actions would be:

1. Submit a patch that uses XID_FMT everywhere, as a cosmetic change.
This looks like it will reduce the main patch size considerably and
make it much less scary. That can be cleaned up and committed while we
discuss the main approach.

2. Write up the approach in a detailed README, so people can
understand the proposal and assess if there are problems. A few short
notes and a link back to old conversations isn't enough to allow wide
review and give confidence on such a major patch.

--
Simon Riggs http://www.EnterpriseDB.com/

#34Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Finnerty, Jim (#33)
Re: Add 64-bit XIDs into PostgreSQL 15

Maxim, maybe it's still a good idea to isolate those two patches and
submit them separately first, to reduce the size of the rest of the patch?

Looks to me like the best next actions would be:

1. Submit a patch that uses XID_FMT everywhere, as a cosmetic change.
This looks like it will reduce the main patch size considerably and
make it much less scary. That can be cleaned up and committed while we
discuss the main approach.

2. Write up the approach in a detailed README, so people can
understand the proposal and assess if there are problems. A few short
notes and a link back to old conversations isn't enough to allow wide
review and give confidence on such a major patch.

Big thanks to all for your ideas!

We intend to do the following work on the patch soon:
1. Write a detailed README
2. Split the patch into several pieces, including a separate part for
XID_FMT. But if committers meanwhile choose to commit Jim's XID_FMT patch,
we would also appreciate that and will rebase our patch accordingly.
   2A. Probably refactor it to store precalculated XMIN/XMAX in the
in-memory tuple representation instead of t_xid_base/t_multi_base
   2B. Split the in-memory part of the patch into a separate patch
3. Construct some variants for leaving "double xmax" format as a temporary
one just after upgrade for having only one persistent on-disk format
instead of two.
3A. By using SQL function "vacuum doublexmax;"
OR
3B. By freeing space on all heap pages for pd_special before
pg_upgrade.
OR
3C. By automatically repacking all "double xmax" pages after upgrade
(with a priority specified by common vacuum-related GUCs)
4. Intentionally prohibit starting a new transaction whose XID differs by
more than 2^32 from the oldest currently running one (see the rough sketch
below this list). This is meant to prompt the DBA to clean up defunct
transactions, but it is not binding: they can simply wait if they consider
those old transactions legitimate.
5. Investigate and add a solution for archs without 64-bit atomic values.
    5A. Provide XID 8-byte alignment for systems where 64-bit atomics are
provided only for 8-byte-aligned values.
5B. Wrap XID reading into PG atomic locks for remaining 32-bit ones
(they are expected to be rare).
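
As a rough illustration of item 4 (the variable names are hypothetical, and
the exact place for such a check is still to be decided):

    /*
     * Hypothetical guard, e.g. in GetNewTransactionId(): refuse to hand out
     * an XID that is more than 2^32 ahead of the oldest running transaction.
     */
    if (nextXid - oldestRunningXid > (TransactionId) 0xFFFFFFFF)
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("cannot assign a new transaction ID: the oldest running transaction is more than 2^32 transactions behind"),
                 errhint("Commit, roll back, or terminate the long-running transaction.")));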

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#35Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#34)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

We intend to do the following work on the patch soon:
1. Write a detailed README
2. Split the patch into several pieces including a separate part for
XID_FMT. But if committers meanwhile choose to commit Jim's XID_FMT patch
we also appreciate this and will rebase our patch accordingly.
2A. Probably refactor it to store precalculated XMIN/XMAX in memory
tuple representation instead of t_xid_base/t_multi_base
2B. Split the in-memory part of a patch as a separate
3. Construct some variants for leaving "double xmax" format as a temporary
one just after upgrade for having only one persistent on-disk format
instead of two.
3A. By using SQL function "vacuum doublexmax;"
OR
3B. By freeing space on all heap pages for pd_special before
pg-upgrade.
OR
3C. By automatically repacking all "double xmax" pages after upgrade
(with a priority specified by common vacuum-related GUCs)
4. Intentionally prohibit starting a new transaction with XID difference
of more than 2^32 from the oldest currently running one. This is to enforce
some dba's action for cleaning defunct transaction but not binding one:
he/she can wait if they consider these old transactions not defunct.
5. Investigate and add a solution for archs without 64-bit atomic values.
5A. Provide XID 8-byte alignment for systems where 64-bit atomics
is provided for 8-byte aligned values.
5B. Wrap XID reading into PG atomic locks for remaining 32-bit ones
(they are expected to be rare).

Hi, hackers!

PFA a patch with a README for the 64-bit xid proposal. It is the 0003 patch
of the same v6 that was proposed earlier [1]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com.
As always, I very much appreciate your ideas on this README patch, on the
overall 64-bit xid patch [1]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com, and on the roadmap for its improvement quoted above.

[1]: /messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Attachments:

v6-0003-README-for-64bit-xid.patchapplication/octet-stream; name=v6-0003-README-for-64bit-xid.patch
#36Julien Rouhaud
Julien Rouhaud
rjuju123@gmail.com
In reply to: Pavel Borisov (#35)
3 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On Fri, Jan 14, 2022 at 11:38:46PM +0400, Pavel Borisov wrote:

PFA patch with README for 64xid proposal. It is 0003 patch of the same v6,
that was proposed earlier [1].
As always, I very much appreciate your ideas on this readme patch, on
overall 64xid patch [1], and on the roadmap on its improvement quoted above.

Thanks for adding this documentation! Unfortunately, the cfbot can't apply a
patchset split across multiple emails, so for now you only get coverage for
this new README file, as this is what's being tested on the CI:
https://github.com/postgresql-cfbot/postgresql/commit/f8f12ce29344bc7c72665c334b5eb40cee22becd

Could you send the full patchset each time you make a modification? For now
I'm simply attaching 0001, 0002 and 0003 to make sure that the cfbot will
pick up all current patches on its next run.

Attachments:

v6-0001-Add-64-bit-GUCs-for-xids.patchtext/plain; charset=us-ascii
v6-0002-Add-64bit-xids.patchtext/plain; charset=us-ascii
v6-0003-README-for-64bit-xid.patchtext/plain; charset=iso-8859-1
#37Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

I tried to pg_upgrade from a v13 instance like:
time make check -C src/bin/pg_upgrade oldsrc=`pwd`/13 oldbindir=`pwd`/13/tmp_install/usr/local/pgsql/bin

I had compiled and installed v13 into `pwd`/13.

First, test.sh failed, because of an option in initdb which doesn't exist in
the old version: -x 21000000000

I patched test.sh so the option is used only for the "new" version.

The tab_core_types table has an XID column, so pg_upgrade --check complains and
refuses to run. If I drop it, then pg_upgrade runs, but then fails like this:

|Files /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/new_xids.txt and /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/old_xids.txt differ
|See /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/xids.diff
|
|--- /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/new_xids.txt 2022-01-15 00:14:23.035294414 -0600
|+++ /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/old_xids.txt 2022-01-15 00:13:59.634945012 -0600
|@@ -1,5 +1,5 @@
| relfrozenxid | relminmxid
| --------------+------------
|- 3 | 3
|+ 15594 | 3
| (1 row)

Also, the patch needs to be rebased over Peter's vacuum changes.

Here's the changes I used for my test:

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index c6361e3c085..5eae42192b6 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -24,7 +24,8 @@ standard_initdb() {
 	# without increasing test runtime, run these tests with a custom setting.
 	# Also, specify "-A trust" explicitly to suppress initdb's warning.
 	# --allow-group-access and --wal-segsize have been added in v11.
-	"$1" -N --wal-segsize 1 --allow-group-access -A trust -x 21000000000
+	"$@" -N --wal-segsize 1 --allow-group-access -A trust
+
 	if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
 	then
 		cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -237,7 +238,7 @@ fi

PGDATA="$BASE_PGDATA"

-standard_initdb 'initdb'
+standard_initdb 'initdb' -x 21000000000

pg_upgrade $PG_UPGRADE_OPTS --no-sync -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"

diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql b/src/bin/pg_upgrade/upgrade_adapt.sql
index 27c4c7fd011..c5ce8bc95b2 100644
--- a/src/bin/pg_upgrade/upgrade_adapt.sql
+++ b/src/bin/pg_upgrade/upgrade_adapt.sql
@@ -89,3 +89,5 @@ DROP OPERATOR public.#%# (pg_catalog.int8, NONE);
 DROP OPERATOR public.!=- (pg_catalog.int8, NONE);
 DROP OPERATOR public.#@%# (pg_catalog.int8, NONE);
 \endif
+
+DROP TABLE IF EXISTS tab_core_types;
#38Ilya Anfimov
Ilya Anfimov
ilan@tzirechnoy.com
In reply to: Bruce Momjian (#7)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Jan 05, 2022 at 06:51:37PM -0500, Bruce Momjian wrote:

On Tue, Jan 4, 2022 at 10:22:50PM +0000, Finnerty, Jim wrote:

[skipped]

with the "double-xmax" representation. This would eliminate a whole
class of coding errors and would make the code dealing with 64-bit
XIDs simpler and more maintainable.

Well, yes, we could do this, and it would avoid the complexity of having
to support two XID representations, but we would need to accept that
fast pg_upgrade would be impossible in such cases, since every page
would need to be checked and potentially updated.

You might try to do this while the server is first started and running
queries, but I think we found out from the online checkpoint patch that
having the server in an intermediate state while running queries is very
complex --- it might be simpler to just accept two XID formats all the
time than enabling the server to run with two formats for a short
period. My big point is that this needs more thought.

Probably, some table storage housekeeping would be wanted.

Like a column in pg_class describing the current set of options of the
table: checksums added, 64-bit xids added, type of 64-bit xids (probably
some would want to add support for the pgpro upgrades), some set of
defaults to avoid including a lot of them in all page headers -- like
compressed xid/integer formats or extended page size.

And separate tables that describe the transition state -- like, when
adding checksums, the desired state for the relation (checksums) and the
set of ranges in the table files that have already been
transitioned/checked.

That probably will not introduce too much slowdown, at least on reading,
and will add the transition/upgrade mechanics.

Weren't there already some discussions about such a feature on the
mailing lists?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#39Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Ilya Anfimov (#38)
3 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, hackers!

Due to recent upstream changes, patch v6 doesn't apply anymore. I rebased it
without major modifications. PFA v7 of the patch.

The next changes mentioned in [1]/messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com are upcoming.

Please feel free to discuss the README and share your opinions on the current
patch and the proposed changes [1]/messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com.

[1]: /messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Attachments:

v7-0003-README.XID64.patchapplication/octet-stream; name=v7-0003-README.XID64.patch
v7-0001-64-bit-GUCs.patchapplication/octet-stream; name=v7-0001-64-bit-GUCs.patch
v7-0002-Add-64bit-xid.patchapplication/octet-stream; name=v7-0002-Add-64bit-xid.patch
#40Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Pavel Borisov (#39)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Pavel,

Please feel free to discuss readme and your opinions on the current patch and proposed changes [1].

Just a quick question about this design choice:

On-disk tuple format remains unchanged. 32-bit t_xmin and t_xmax store the
lower parts of 64-bit XMIN and XMAX values. Each heap page has additional
64-bit pd_xid_base and pd_multi_base which are common for all tuples on a page.
They are placed into a pd_special area - 16 bytes in the end of a heap page.
Actual XMIN/XMAX for a tuple are calculated upon reading a tuple from a page
as follows:

XMIN = t_xmin + pd_xid_base.
XMAX = t_xmax + pd_xid_base/pd_multi_base.

Did you consider using 4 bytes for pd_xid_base and another 4 bytes for
(pd_xid_base/pd_multi_base)? This would allow calculating XMIN/XMAX
as:

XMIN = (t_min_extra_bits << 32) | t_xmin
XMAX = (t_max_extra_bits << 32) | t_xmax

... and save 8 extra bytes in the pd_special area. Or maybe I'm
missing some context here?

--
Best regards,
Aleksander Alekseev

#41Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Aleksander Alekseev (#40)
Re: Add 64-bit XIDs into PostgreSQL 15

Did you consider using 4 bytes for pd_xid_base and another 4 bytes for
(pd_xid_base/pd_multi_base)? This would allow calculating XMIN/XMAX
as:

XMIN = (t_min_extra_bits << 32) | t_xmin
XMAX = (t_max_extra_bits << 32) | t_xmax

... and save 8 extra bytes in the pd_special area. Or maybe I'm
missing some context here?

Hi, Alexander!

In the current design it is not possible, as pd_xid_base is roughly just the
minimum 64-bit xid of all tuples that may fit on this page. So we do not make
any extra assumption that it should be a multiple of 2^32.

If we made pd_xid_base a multiple of 2^32, then after the current XID crosses
a 2^32 boundary, pages containing tuples with XMIN/XMAX from before that
point would no longer be suitable for tuple inserts. In effect we would then
have "sets" of pages for each 2^32 "epoch" whose freed space could not be
used anymore.

I think that is too big a loss for a gain of just 8 bytes per page.
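
To make this concrete with invented numbers: suppose the next XID is around
6,000,000,100 and a page holds tuples with 64-bit xmin values between
5,999,999,000 and 6,000,000,050. With the current design the page can simply
store pd_xid_base = 5,999,999,000 and 32-bit offsets 0..1050, and it keeps
accepting new tuples until an offset would exceed 2^32 - 1. If pd_xid_base
had to be a multiple of 2^32 (here 4,294,967,296), the same tuples would
already need offsets around 1.7 billion, and as soon as the next XID passed
2 * 2^32 the page could not accept inserts at all until every tuple from the
previous "epoch" had been frozen or removed.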

Thank you for your dive into this matter!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#42Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Pavel Borisov (#39)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On 2022-01-24 16:38:54 +0400, Pavel Borisov wrote:

+64-bit Transaction ID's (XID)
+=============================
+
+A limited number (N = 2^32) of XID's required to do vacuum freeze to prevent
+wraparound every N/2 transactions. This causes performance degradation due
+to the need to exclusively lock tables while being vacuumed. In each
+wraparound cycle, SLRU buffers are also being cut.

What exclusive lock?

+"Double XMAX" page format
+---------------------------------
+
+At first read of a heap page after pg_upgrade from 32-bit XID PostgreSQL
+version pd_special area with a size of 16 bytes should be added to a page.
+Though a page may not have space for this. Then it can be converted to a
+temporary format called "double XMAX".

+All tuples after pg-upgrade would necessarily have xmin = FrozenTransactionId.

Why would a tuple after pg-upgrade necessarily have xmin =
FrozenTransactionId? A pg_upgrade doesn't scan the tables, so the pg_upgrade
itself doesn't do anything to xmins.

I guess you mean that the xmin cannot be needed anymore, because no older
transaction can be running?

+In-memory tuple format
+----------------------
+
+In-memory tuple representation consists of two parts:
+- HeapTupleHeader from disk page (contains all heap tuple contents, not only
+header)
+- HeapTuple with additional in-memory fields
+
+HeapTuple for each tuple in memory stores t_xid_base/t_multi_base - a copies of
+page's pd_xid_base/pd_multi_base. With tuple's 32-bit t_xmin and t_xmax from
+HeapTupleHeader they are used to calculate actual 64-bit XMIN and XMAX:
+
+XMIN = t_xmin + t_xid_base. 					(3)
+XMAX = t_xmax + t_xid_base/t_multi_base.		(4)

What identifies a HeapTuple as having this additional data?

+The downside of this is that we can not use tuple's XMIN and XMAX right away.
+We often need to re-read t_xmin and t_xmax - which could actually be pointers
+into a page in shared buffers and therefore they could be updated by any other
+backend.

Ugh, that's not great.

+Upgrade from 32-bit XID versions
+--------------------------------
+
+pg_upgrade doesn't change pages format itself. It is done lazily after.
+
+1. At first heap page read, tuples on a page are repacked to free 16 bytes
+at the end of a page, possibly freeing space from dead tuples.

That will cause a *massive* torrent of writes after an upgrade. Isn't this
practically making pg_upgrade useless? Imagine a huge cluster where most of
the pages are all-frozen, upgraded using link mode.

What happens if the first access happens on a replica?

What is the approach for dealing with multixact files? They have xids
embedded? And currently the SLRUs will break if you just let the offsets SLRU
grow without bounds.

+void
+convert_page(Relation rel, Page page, Buffer buf, BlockNumber blkno)
+{
+	PageHeader	hdr = (PageHeader) page;
+	GenericXLogState *state = NULL;
+	Page	tmp_page = page;
+	uint16	checksum;
+
+	if (!rel)
+		return;
+
+	/* Verify checksum */
+	if (hdr->pd_checksum)
+	{
+		checksum = pg_checksum_page((char *) page, blkno);
+		if (checksum != hdr->pd_checksum)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("page verification failed, calculated checksum %u but expected %u",
+							checksum, hdr->pd_checksum)));
+	}
+
+	/* Start xlog record */
+	if (!XactReadOnly && XLogIsNeeded() && RelationNeedsWAL(rel))
+	{
+		state = GenericXLogStart(rel);
+		tmp_page = GenericXLogRegisterBuffer(state, buf, GENERIC_XLOG_FULL_IMAGE);
+	}
+
+	PageSetPageSizeAndVersion((hdr), PageGetPageSize(hdr),
+							  PG_PAGE_LAYOUT_VERSION);
+
+	if (was_32bit_xid(hdr))
+	{
+		switch (rel->rd_rel->relkind)
+		{
+			case 'r':
+			case 'p':
+			case 't':
+			case 'm':
+				convert_heap(rel, tmp_page, buf, blkno);
+				break;
+			case 'i':
+				/* no need to convert index */
+			case 'S':
+				/* no real need to convert sequences */
+				break;
+			default:
+				elog(ERROR,
+					 "Conversion for relkind '%c' is not implemented",
+					 rel->rd_rel->relkind);
+		}
+	}
+
+	/*
+	 * Mark buffer dirty unless this is a read-only transaction (e.g. query
+	 * is running on hot standby instance)
+	 */
+	if (!XactReadOnly)
+	{
+		/* Finish xlog record */
+		if (XLogIsNeeded() && RelationNeedsWAL(rel))
+		{
+			Assert(state != NULL);
+			GenericXLogFinish(state);
+		}
+
+		MarkBufferDirty(buf);
+	}
+
+	hdr = (PageHeader) page;
+	hdr->pd_checksum = pg_checksum_page((char *) page, blkno);
+}

Wait. So you just modify the page without WAL logging or marking it dirty on a
standby? I fail to see how that can be correct.

Imagine the cluster is promoted, the page is dirtied, and we write it
out. You'll have written out a completely changed page, without any WAL
logging. There's plenty other scenarios.

Greetings,

Andres Freund

#43Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Andres Freund (#42)
3 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Andres!

I've revised the README a little bit to address your corrections and
questions. Thanks for this very much!
A patchset with changed README is attached as v8 here (the code is
unchanged and identical to v7).

+The downside of this is that we can not use tuple's XMIN and XMAX right away.
+We often need to re-read t_xmin and t_xmax - which could actually be pointers
+into a page in shared buffers and therefore they could be updated by any other
+backend.

Ugh, that's not great.

Agreed. This part is one of the candidates for revision as per the proposals
above [1]/messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com, i.e.:
"2A. Probably refactor it to store precalculated XMIN/XMAX in the in-memory
tuple representation instead of t_xid_base/t_multi_base".

We are working on this change.

What happens if the first access happens on a replica?

What is the approach for dealing with multixact files? They have xids
embedded? And currently the SLRUs will break if you just let the offsets SLRU
grow without bounds.

Wait. So you just modify the page without WAL logging or marking it dirty on
a standby? I fail to see how that can be correct.

Imagine the cluster is promoted, the page is dirtied, and we write it
out. You'll have written out a completely changed page, without any WAL
logging. There's plenty other scenarios.

In this part, I suppose you've found a definite bug. Thanks! There are a
couple of ways it could be fixed:

1. If we enforce a checkpoint at replica promotion, then we force full-page
writes after each page modification afterward.

2. Maybe it's worth using a BufferDesc bit to mark the page as converted to
64-bit xids but not yet written to disk? For example, one of the four bits
from BUF_USAGECOUNT: BM_MAX_USAGE_COUNT = 5, so 3 bits are enough to store
it. This would change the in-memory page representation but would not need
WAL-logging, which is impossible on a replica.

What do you think about it?

[1]: /messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com
/messages/by-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m=BgCpNA@mail.gmail.com

Attachments:

v8-0003-README.XID64.patchapplication/octet-stream; name=v8-0003-README.XID64.patch
v8-0001-64-bit-GUCs.patchapplication/octet-stream; name=v8-0001-64-bit-GUCs.patch
v8-0002-Add-64bit-xid.patchapplication/octet-stream; name=v8-0002-Add-64bit-xid.patch
#44Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Pavel Borisov (#43)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

In this part, I suppose you've found a definite bug. Thanks! There are a couple
of ways how it could be fixed:

1. If we enforce checkpoint at replica promotion then we force full-page writes after each page modification afterward.

2. Maybe it's worth using BufferDesc bit to mark the page as converted to 64xid but not yet written to disk? For example, one of four bits from BUF_USAGECOUNT.
BM_MAX_USAGE_COUNT = 5 so it will be enough 3 bits to store it. This will change in-memory page representation but will not need WAL-logging which is impossible on a replica.

What do you think about it?

I'm having difficulties merging and/or testing
v8-0002-Add-64bit-xid.patch since I'm not 100% sure which commit this
patch was targeting. Could you please submit a rebased patch and/or
share your development branch on GitHub?

I agree with Bruce it would be great to deliver this in PG15. Please
let me know if you believe it's unrealistic for any reason so I will
focus on testing and reviewing other patches.

For now, I'm changing the status of the patch to "Waiting on Author".

--
Best regards,
Aleksander Alekseev

#45Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Maxim Orlov (#1)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers!

Hi! Here is the rebased version.

I'd like to add a description of what was done in v9:
- The patch is rebased on the current master branch.
- The in-memory tuple storage format was refactored, as promised, to hold
pre-calculated 64-bit xmin and xmax, not just copies of pd_xid_base and
pd_multi_base.
- Fixed the bug reported by Andres Freund with lazy conversion of pages
upgraded from 32-bit to 64-bit xids when the first tuple read (and therefore
the lazy conversion) happens in a read-only state (a read-only xact or on a
replica). In this case the in-memory buffer descriptor is now marked with a
REGBUF_CONVERTED flag. When the cluster comes to a read-write state, this
leads to emitting a full-page-write xlog record for this page (see the rough
sketch below).
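
For illustration, a rough sketch of how this behaves (REGBUF_CONVERTED and
the conversion step come from the description above; the small helpers are
hypothetical):

    /* right after a page has been lazily converted to the 64-bit format */
    if (XactReadOnly || RecoveryInProgress())
    {
        /*
         * We cannot write WAL here (e.g. on a hot standby), so only remember
         * the conversion in the in-memory buffer descriptor.
         */
        buffer_set_converted_flag(buf);        /* hypothetical helper */
    }
    else
    {
        /* normal read-write case: WAL-log a full image of the converted page */
        GenericXLogState *state = GenericXLogStart(rel);

        GenericXLogRegisterBuffer(state, buf, GENERIC_XLOG_FULL_IMAGE);
        GenericXLogFinish(state);
    }

    /* later, on the first write access in a read-write state */
    if (buffer_converted_flag_is_set(buf))     /* hypothetical helper */
    {
        log_newpage_buffer(buf, true);         /* emit a full-page image to WAL */
        buffer_clear_converted_flag(buf);
    }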

Relevant changes in README are also done.

We'd very much appreciate enthusiasm for having 64-bit xids in PG15 and any
effort to review and test this feature.

Alexander, thanks for your attention to the patchset. Your questions and
review are very much welcome!
The participation of other hackers is highly appreciated as always!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#46Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Pavel Borisov (#45)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Mar 02, 2022 at 06:43:11PM +0400, Pavel Borisov wrote:

Hi hackers!

Hi! Here is the rebased version.

The patch doesn't apply - I suppose the patch is relative to a forked
postgres which already has other patches.

http://cfbot.cputube.org/pavel-borisov.html

Note also that I mentioned an issue with pg_upgrade. Handling that well is
probably the most important part of the patch.

--
Justin

#47Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Aleksander Alekseev (#44)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On 2022-03-02 16:25:33 +0300, Aleksander Alekseev wrote:

I agree with Bruce it would be great to deliver this in PG15.

Please let me know if you believe it's unrealistic for any reason so I will
focus on testing and reviewing other patches.

I don't see 15 as a realistic target for this patch. There's huge amounts of
work left, it has gotten very little review.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

Greetings,

Andres Freund

#48Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Andres Freund (#47)
3 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

The patch doesn't apply - I suppose the patch is relative a forked postgres

No, the authors just used a slightly outdated `master` branch. I
successfully applied it against 31d8d474 and then rebased it to the
latest master (62ce0c75). The new version is attached.

Not 100% sure if my rebase is correct since I didn't invest too much
time into reviewing the code. But at least it passes `make
installcheck` locally. Let's see what cfbot will tell us.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

Completely agree. And the changes like:

+#if 0 /* XXX remove unit tests */

... suggest that the patch is pretty raw in its current state.

Pavel, Maxim, do you mind me splitting the patchset, or would you
like to do it yourself and/or maybe include more changes? I don't know
how actively you are working on this.

--
Best regards,
Aleksander Alekseev

Attachments:

v10-0001-64-bit-GUCs.patchapplication/octet-stream; name=v10-0001-64-bit-GUCs.patch
v10-0002-README.XID64.patchapplication/octet-stream; name=v10-0002-README.XID64.patch
v10-0003-Add-64bit-xid.patchapplication/octet-stream; name=v10-0003-Add-64bit-xid.patch
#49Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Aleksander Alekseev (#48)
Re: Add 64-bit XIDs into PostgreSQL 15

The patch doesn't apply - I suppose the patch is relative a forked postgres

No, the authors just used a little outdated `master` branch. I
successfully applied it against 31d8d474 and then rebased to the
latest master (62ce0c75). The new version is attached.

Not 100% sure if my rebase is correct since I didn't invest too much
time into reviewing the code. But at least it passes `make
installcheck` locally. Let's see what cfbot will tell us.

Thank you very much! We'll do the same rebase and check this soon. Let's
use v10 now.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

Completely agree. And the changes like:

+#if 0 /* XXX remove unit tests */

... suggest that the patch is pretty raw in its current state.

Pavel, Maxim, don't you mind me splitting the patchset, or would you
like to do it yourself and/or maybe include more changes? I don't know
how actively you are working on this.

I don't mind and appreciate you joining this. If you split it, we'll just
base the next versions on your split. Of course, there is much to do and we
are working on this patch, including the pg_upgrade test failure reported by
Justin, which we haven't had time to concentrate on before. We try to make
changes in small batches, so I think we can manage parallel changes. At least
I read this thread very often and can answer soon, even if our new versions
of the patches are not ready.

Again I consider the work you propose useful and big thanks to you,
Alexander!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#50Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#49)
Re: Add 64-bit XIDs into PostgreSQL 15

BTW, messages with patches in this thread always invoke manual spam
moderation, and we need to wait ~3 hours before a message with a patch
becomes visible in the hackers thread. Now, when I've already answered
Alexander's letter with the v10 patch, the very message (and patch) I
answered is still not visible in the thread and to the CFbot.

Can something be done in the hackers' moderation engine to make new versions
of patches become visible hassle-free?

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#51Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Pavel Borisov (#50)
3 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

We've rebased the patchset onto the current master. The result is almost the
same as Alexander's v10 (it is a shame it is still in moderation and not
visible in the thread).
Anyway, this is the v11 patch. Reviews are very welcome.

--
Best regards,
Maxim Orlov.

Attachments:

v11-0001-64-bit-GUCs.patchapplication/octet-stream; name=v11-0001-64-bit-GUCs.patch
v11-0002-README.XID64.patchapplication/octet-stream; name=v11-0002-README.XID64.patch
v11-0003-Add-64bit-xid.patchapplication/octet-stream; name=v11-0003-Add-64bit-xid.patch
#52Alexander Korotkov
Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#50)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Pavel!

On Thu, Mar 3, 2022 at 2:35 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

BTW messages with patches in this thread are always invoke manual spam moderation and we need to wait for ~3 hours before the message with patch becomes visible in the hackers thread. Now when I've already answered Alexander's letter with v10 patch the very message (and a patch) I've answered is still not visible in the thread and to CFbot.

Can something be done in hackers' moderation engine to make new versions patches become visible hassle-free?

Is your email address subscribed to the pgsql-hackers mailing list?
AFAIK, moderation is only applied to non-subscribers.

------
Regards,
Alexander Korotkov

#53Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#52)
Re: Add 64-bit XIDs into PostgreSQL 15

BTW messages with patches in this thread are always invoke manual spam
moderation and we need to wait for ~3 hours before the message with patch
becomes visible in the hackers thread. Now when I've already answered
Alexander's letter with v10 patch the very message (and a patch) I've
answered is still not visible in the thread and to CFbot.

Can something be done in hackers' moderation engine to make new versions
patches become visible hassle-free?

Is your email address subscribed to the pgsql-hackers mailing list?
AFAIK, moderation is only applied for non-subscribers.

Hi, Alexander!

Yes, it is on the list. The problem is that the patch is over 1 MB, so it
always goes through moderation. And this has been the case for 2 months
already.
I was advised to use .gz, which I will do next time.

I've requested increasing the threshold to 2 MB [1].

[1]: /messages/by-id/CALT9ZEGbAR84q_emsf1TUMPqXT=c8CxN16g-HQCxgkLzekM+Qg@mail.gmail.com
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#54Stephen Frost
Stephen Frost
sfrost@snowman.net
In reply to: Pavel Borisov (#53)
Re: Add 64-bit XIDs into PostgreSQL 15

Greetings,

* Pavel Borisov (pashkin.elfe@gmail.com) wrote:

BTW messages with patches in this thread are always invoke manual spam
moderation and we need to wait for ~3 hours before the message with patch
becomes visible in the hackers thread. Now when I've already answered
Alexander's letter with v10 patch the very message (and a patch) I've
answered is still not visible in the thread and to CFbot.

Can something be done in hackers' moderation engine to make new versions
patches become visible hassle-free?

Is your email address subscribed to the pgsql-hackers mailing list?
AFAIK, moderation is only applied for non-subscribers.

Yes, it is in the list. The problem is that patch is over 1Mb. So it
strictly goes through moderation. And this is unchanged for 2 months
already.

Right, >1MB will be moderated, as will emails that are CC'd to multiple
lists, and somehow this email thread ended up with two different
addresses for -hackers, which isn't good.

I was advised to use .gz, which I will do next time.

Better would be to break the patch down into reasonable and independent
pieces for review and commit on separate threads as suggested previously
and not to send huge patches to the list with the idea that someone is
going to actually fully review and commit them. That's just not likely
to end up working well anyway.

Thanks,

Stephen

#55Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Stephen Frost (#54)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

We've rebased patchset onto the current master. The result is almost the
same as Alexander's v10 (it is a shame it is still in moderation and not
visible in the thread). Anyway, this is the v11 patch. Reviews are very
welcome.

Here is a rebased and slightly modified version of the patch.

I extracted the introduction of XID_FMT macro to a separate patch. Also,
I noticed that sometimes PRIu64 was used to format XIDs instead. I changed it
to XID_FMT for consistency. v12-0003 can be safely delivered in PG15.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

I'm going to address this in follow-up emails.

--
Best regards,
Aleksander Alekseev

Attachments:

v12-16bit-xids.tgzapplication/x-gzip; name=v12-16bit-xids.tgz
#56Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#55)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

Here is a rebased and slightly modified version of the patch.

I extracted the introduction of XID_FMT macro to a separate patch. Also,
I noticed that sometimes PRIu64 was used to format XIDs instead. I changed it
to XID_FMT for consistency. v12-0003 can be safely delivered in PG15.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

I'm going to address this in follow-up emails.

cfbot is not happy because several files are missing in v12. Here is a
corrected and rebased version. I also removed the "#undef PRIu64"
change from include/c.h since previously I replaced PRIu64 usage with
XID_FMT.

--
Best regards,
Aleksander Alekseev

Attachments:

v13-64bit-xids.tgzapplication/x-gzip; name=v13-64bit-xids.tgz
#57Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#56)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

I extracted the introduction of XID_FMT macro to a separate patch. Also,
I noticed that sometimes PRIu64 was used to format XIDs instead. I changed it
to XID_FMT for consistency. v12-0003 can be safely delivered in PG15.

[...]

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

I'm going to address this in follow-up emails.

cfbot is not happy because several files are missing in v12. Here is a
corrected and rebased version. I also removed the "#undef PRIu64"
change from include/c.h since previously I replaced PRIu64 usage with
XID_FMT.

Here is a new version of the patchset. SLRU refactoring was moved to a
separate patch. Both v14-0003 (XID_FMT macro) and v14-0004 (SLRU
refactoring) can be delivered in PG15.

One thing I couldn't understand so far is why SLRU_PAGES_PER_SEGMENT
should necessarily be increased in order to make 64-bit XIDs work. I
kept the current value (32) in v14-0004 but changed it to 2048 in
./v14-0005 (where we start using 64 bit XIDs) as it was in the
original patch. Is this change really required?

--
Best regards,
Aleksander Alekseev

Attachments:

v14-64bit-xids.tgzapplication/x-gzip; name=v14-64bit-xids.tgz
#58Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#57)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

Here is a new version of the patchset. SLRU refactoring was moved to a
separate patch. Both v14-0003 (XID_FMT macro) and v14-0004 (SLRU
refactoring) can be delivered in PG15.

Here is a new version of the patchset. The changes compared to v14 are
minimal. Most importantly, the GCC warning reported by cfbot was
(hopefully) fixed. The patch order was also altered, v15-0001 and
v15-0002 are targeting PG15 now, the rest are targeting PG16.

Also, for the record, I tested the patchset on a Raspberry Pi 3 Model B+
in the hope that it would reveal some new flaws. To my disappointment,
it didn't.

--
Best regards,
Aleksander Alekseev

Attachments:

v15-64bit-xids.tgzapplication/x-gzip; name=v15-64bit-xids.tgz
#59Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#58)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

Here is a new version of the patchset. SLRU refactoring was moved to a
separate patch. Both v14-0003 (XID_FMT macro) and v14-0004 (SLRU
refactoring) can be delivered in PG15.

Here is a new version of the patchset. The changes compared to v14 are
minimal. Most importantly, the GCC warning reported by cfbot was
(hopefully) fixed. The patch order was also altered, v15-0001 and
v15-0002 are targeting PG15 now, the rest are targeting PG16.

Also for the record, I tested the patchset on Raspberry Pi 3 Model B+
in the hope that it will discover some new flaws. To my
disappointment, it didn't.

Here is the rebased version of the patchset. Also, I updated the
commit messages for v16-0001 and v16-0002 to make them look more like
the rest of the PostgreSQL commit messages. They include the link to
this discussion now as well.

IMO v16-0001 and v16-0002 are in pretty good shape and are as much as
we are going to deliver in PG15. I'm going to change the status of the
CF entry to "Ready for Committer" somewhere this week unless someone
believes v16-0001 and/or v16-0002 shouldn't be merged.

--
Best regards,
Aleksander Alekseev

#60Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#59)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

IMO v16-0001 and v16-0002 are in pretty good shape and are as much as
we are going to deliver in PG15. I'm going to change the status of the
CF entry to "Ready for Committer" somewhere this week unless someone
believes v16-0001 and/or v16-0002 shouldn't be merged.

Sorry for the missing attachment. Here it is.

--
Best regards,
Aleksander Alekseev

Attachments:

v16-64bit-xids.tgzapplication/x-gzip; name=v16-64bit-xids.tgz
#61Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Aleksander Alekseev (#60)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi! Here is an updated version of the patch, based on Alexander's v16.

--
Best regards,
Maxim Orlov.

Attachments:

v17-64bit-xids.tgzapplication/gzip; name=v17-64bit-xids.tgz
#62Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Maxim Orlov (#61)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Hackers!

Hi! Here is updated version of the patch, based on Alexander's ver16.

I'd like to add a few quick notes on what's been done in v17.

Patches 0001 and 0002, which are planned to be committed to PG15, are almost
unchanged, with the exception of one unnecessary cast removed in 0002.

We've also addressed several issues in patch 0005 (which is planned for
PG16):
- The bug with frozen xids after pg_upgrade, reported by Justin [1]/messages/by-id/20220115063925.GS14051@telsasoft.com
- Added proper processing of double xmax pages in
HeapPageSetPruneXidInternal()
- Fixed xid comparison. Initially in the patch it was changed to simple <,
<=, >=, > comparisons of 64-bit values. The v17 patch returns this to a
scheme similar to what is used in STABLE branches for 32-bit xids, but using
a modulo-2^64 numeric ring. The main goal of this change was to fix the SLRU
tests that Alexander mentioned had been disabled. We've fixed and enabled
most of them, but some still need to be fixed and enabled.

Also, we've pgindent-ed all the patches.

As patches that are planned to be delivered to PG15 are almost unchanged, I
completely agree with Alexander's plan to consider these patches (0001 and
0002) as RfC.

All activity, improvement, review, etc. related to the whole patchset is
also very much appreciated. Big thanks to Alexander for working on the
patch set!

[1]: /messages/by-id/20220115063925.GS14051@telsasoft.com

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#63Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#62)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Hackers!

Hi! Here is updated version of the patch, based on Alexander's ver16.

I'd like to add a few quick notes on what's been done in v17.

Patches 0001 and 0002 that are planned to be committed to PG15 are almost
unchanged with the exception of one unnecessary cast in 0002 removed.

We've also addressed several issues in patch 0005 (which is planned for
PG16):
- The bug with frozen xids after pg_upgrade, reported by Justin [1]
- Added proper processing of double xmax pages in
HeapPageSetPruneXidInternal()
- Fixed xids comparison. Initially in the patch it was changed to simple <
<= => > for 64 bit values. Now v17 patch has returned this to the way
similar to what is used in STABLE for 32-bit xids, but using modulus-64
numeric ring. The main goal of this change was to fix SRLU tests that were mentioned
by Alexander to have been disabled. We've fixed and enabled most of them,
but some of them are still need to be fixed and enabled.

Also, we've pgindent-ed all the patches.

As patches that are planned to be delivered to PG15 are almost unchanged,
I completely agree with Alexander's plan to consider these patches (0001
and 0002) as RfC.

All activity, improvement, review, etc. related to the whole patchset is
also very much appreciated. Big thanks to Alexander for working on the
patch set!

[1]
/messages/by-id/20220115063925.GS14051@telsasoft.com

Also, the patch v17 (0005) returns SLRU_PAGES_PER_SEGMENT to the previous
value of 32.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#64Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Pavel Borisov (#63)
Re: Add 64-bit XIDs into PostgreSQL 15

At Mon, 14 Mar 2022 19:43:40 +0400, Pavel Borisov <pashkin.elfe@gmail.com> wrote in

I'd like to add a few quick notes on what's been done in v17.

I have some comments from a quick look-through. I apologize in advance for
any wrong comments due to my lack of knowledge of the whole patch set.

Patches 0001 and 0002 that are planned to be committed to PG15 are almost
unchanged with the exception of one unnecessary cast in 0002 removed.

0001:

XID_FMT has quite a bad impact on the translatability of error
messages. Commit 3286065651 removed INT64_FORMAT from translatable
texts for that reason. This re-introduces it in several places.
0001 itself does no harm, but 0005 replaces XID_FMT with
INT64_FORMAT. Other patches have the same issue, too.

We've also addressed several issues in patch 0005 (which is planned for
PG16):
- The bug with frozen xids after pg_upgrade, reported by Justin [1]
- Added proper processing of double xmax pages in
HeapPageSetPruneXidInternal()
- Fixed xids comparison. Initially in the patch it was changed to simple <
<= => > for 64 bit values. Now v17 patch has returned this to the way
similar to what is used in STABLE for 32-bit xids, but using modulus-64
numeric ring. The main goal of this change was to fix SRLU tests that were mentioned

IIUC, the following part in 0002 doesn't consider wraparound.

-asyncQueuePagePrecedes(int p, int q)
+asyncQueuePagePrecedes(int64 p, int64 q)
 {
-	return asyncQueuePageDiff(p, q) < 0;
+	return p < q;
 }
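
For context, the pre-patch code compares such counters on a circular number
ring, roughly with the idiom below (a simplified sketch of the idiom, not the
actual asyncQueuePageDiff() code); a plain p < q on int64 page numbers is
only safe if the numbering can never wrap around:

    static bool
    precedes_with_wraparound(uint32 p, uint32 q)
    {
        int32 diff = (int32) (p - q);   /* modular difference, read as signed */

        return diff < 0;
    }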

by Alexander to have been disabled. We've fixed and enabled most of them,
but some of them are still need to be fixed and enabled.

Also, we've pgindent-ed all the patches.

0005 has "new blank line at EOF".

As patches that are planned to be delivered to PG15 are almost unchanged,
I completely agree with Alexander's plan to consider these patches (0001
and 0002) as RfC.

All activity, improvement, review, etc. related to the whole patchset is
also very much appreciated. Big thanks to Alexander for working on the
patch set!

[1]
/messages/by-id/20220115063925.GS14051@telsasoft.com

Also, the patch v17 (0005) returns SLRU_PAGES_PER_SEGMENT to the previous
value of 32.

0002 re-introduces pg_strtouint64, which has recently been removed by
3c6f8c011f.

Simplify the general-purpose 64-bit integer parsing APIs

pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but
it seems no longer necessary to have this indirection.

..

type definition int64/uint64. For that, add new macros strtoi64() and
strtou64() in c.h as thin wrappers around strtol()/strtoul() or
strtoll()/stroull(). This makes these functions available everywhere

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#65Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Kyotaro Horiguchi (#64)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Kyotaro!

0001:

The XID_FMT has quite bad impact on the translatability of error
messages. 3286065651 has removed INT64_FORMAT from translatable
texts for the reason. This re-introduces that in several places.
0001 itself does not harm but 0005 replaces XID_FMT with
INT64_FORMAT. Other patches have the same issue, too.

I do understand your concern, and I wonder how I can do this better. My
first intention was to replace XID_FMT with %llu and INT64_FORMAT with
%lld. This should solve the translatability issue, but I'm not sure about
the portability of this approach. Will this work on Windows, etc.? Can you
advise me on the best solution?

We've fixed all the other things mentioned. Thanks!

Also added two fixes:
- The CF bot was unhappy with the pg_upgrade test in v17 because I forgot to
add a fix for the computation of relminmxid during vacuum on a fresh database.
- Replace frozen or invalid xmin with FrozenTransactionId or
InvalidTransactionId, respectively, during tuple conversion to 64-bit xids.

Reviews are welcome as always! Thanks!
--
Best regards,
Maxim Orlov.

Attachments:

v18-64bit-xids.tgzapplication/gzip; name=v18-64bit-xids.tgz
#66Michael Banck
Michael Banck
michael.banck@credativ.de
In reply to: Aleksander Alekseev (#59)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On Mon, Mar 14, 2022 at 01:32:04PM +0300, Aleksander Alekseev wrote:

IMO v16-0001 and v16-0002 are in pretty good shape and are as much as
we are going to deliver in PG15. I'm going to change the status of the
CF entry to "Ready for Committer" somewhere this week unless someone
believes v16-0001 and/or v16-0002 shouldn't be merged.

Not sure, but if you want more people to look at them, probably best
would be to start a new thread with just the v15-target patches. Right
now, one has to download your tarball, extract it and look at the
patches in there.

I hope v16-0001 and v16-0002 are small enough (I didn't do the above)
that they can just be attached normally?

Michael

--
Michael Banck
Team Lead PostgreSQL
Project Manager
Tel.: +49 2166 9901-171
Mail: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley

Our handling of personal data is subject to:
https://www.credativ.de/en/contact/privacy/

#67Kyotaro Horiguchi
Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Maxim Orlov (#65)
Re: Add 64-bit XIDs into PostgreSQL 15

At Tue, 15 Mar 2022 18:48:34 +0300, Maxim Orlov <orlovmg@gmail.com> wrote in

Hi Kyotaro!

0001:

The XID_FMT has quite bad impact on the translatability of error
messages. 3286065651 has removed INT64_FORMAT from translatable
texts for the reason. This re-introduces that in several places.
0001 itself does not harm but 0005 replaces XID_FMT with
INT64_FORMAT. Other patches have the same issue, too.

I do understand your concern and I wonder how I can do this better? My
first intention was to replace XID_FMT with %llu and INT64_FORMAT with
%lld. This should solve the translatability issue, but I'm not sure about
portability of this. Should this work on Windows, etc? Can you advise me on
the best solution?

Doesn't doing "errmsg("blah blah %lld ..", (long long) xid)" work?
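
For example (a minimal illustration of that pattern; the message text here is
made up):

    ereport(ERROR,
            (errmsg("could not process transaction %llu",
                    (unsigned long long) xid)));

Because %llu/%lld are plain printf conversions, the format string stays the
same on every platform and remains translatable, while the explicit cast
keeps the call portable regardless of how uint64 is defined.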

We've fixed all the other things mentioned. Thanks!

Also added two fixes:
- CF bot was unhappy with pg_upgrade test in v17 because I forgot to add a
fix for computation of relminmxid during vacuum on a fresh database.
- Replace frozen or invalid x_min with FrozenTransactionId or
InvalidTransactionId respectively during tuple conversion to 64xid.

Reviews are welcome as always! Thanks!

My pleasure.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#68Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Kyotaro Horiguchi (#67)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Shame on me, in v18 I forgot to add a fix for pg_verifybackup. Here is a
new version. I think this version will make CF bot happy.

--
Best regards,
Maxim Orlov.

Attachments:

v19-64bit-xids.tgzapplication/gzip; name=v19-64bit-xids.tgz
#69Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Maxim Orlov (#68)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

I forked this thread as requested by several people in the discussion [1]/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com.

The new thread contains two patches that are targeting PG15. I replaced the
thread in the current CF with [1]/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com. This thread was added to the next CF. I
suggest we continue discussing changes targeting PG >= 16 here.

[1]: /messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com

--
Best regards,
Aleksander Alekseev

#70Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Aleksander Alekseev (#69)
Re: Add 64-bit XIDs into PostgreSQL 15

I forked this thread as requested by several people in the discussion [1].

The new thread contains two patches that are targeting PG15. I replaced
the thread in the current CF to [1]. This thread was added to the next CF.
I suggest we continue discussing changes targeting PG >= 16 here.

[1]:
/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com

Thanks!
We're planning to add 0001 and 0002 with the next v20 to the mentioned [1]
thread, to deliver them into v15.
In this thread we'll add the full v20 patchset, as 0003, 0004 and 0005 are
not supposed to be committed without 0001 and 0002 anyway.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#71Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Pavel Borisov (#70)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

We've revised the whole patch set.
Things changed:
- use localizable printf format, compatible with 32 and 64 bit xids
- replace str2unt64 and similar functions with strtou64 call
- rebase onto current master branch
- use proper type modifiers for sscanf calls

What about adding the 0003 patch to [1]/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com to deliver it in PG15? I think it's
quite possible.

[1]: /messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com

--
Best regards,
Maxim Orlov.

Attachments:

v20-64bit-xids.tgzapplication/gzip; name=v20-64bit-xids.tgz
#72Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#71)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is v22 with the following changes:
- use an explicit unsigned long long cast for printf/elog XIDs instead of the
XID_TYPE macro
- add *.po localization
- fix forgotten XID format changes in pg_resetwal.c
- 0006 patch refactoring

Your reviews are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v22-64bit-xids.tgzapplication/gzip; name=v22-64bit-xids.tgz
#73Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Maxim Orlov (#72)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, hackers!

While working on the patchset I've noticed that FullTransactionId has lost
its purpose in this scope: TransactionId is now supposed to be 64-bit by
default, and the epoch approach becomes outdated. What do you think of fully
removing FullTransactionId and its support functions and macros?

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

#74Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Maxim Orlov (#72)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On 2022-03-18 18:22:00 +0300, Maxim Orlov wrote:

Here is v22 with following changes:
- use explicit unsigned long long cast for printf/elog XIDs instead of
macro XID_TYPE
- add *.po localization
- fix forgotten XIDs format changes in pg_resetwal.c
- 0006 patch refactoring

FWIW, 0006 still does way way too many things at once, even with 0001-0003
split out. I don't really see a point in reviewing / posting new versions
until that's done.

Greetings,

Andres Freund

#75Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Aleksander Alekseev (#58)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, Mar 11, 2022 at 08:26:26PM +0300, Aleksander Alekseev wrote:

Hi hackers,

Here is a new version of the patchset. SLRU refactoring was moved to a
separate patch. Both v14-0003 (XID_FMT macro) and v14-0004 (SLRU
refactoring) can be delivered in PG15.

Here is a new version of the patchset. The changes compared to v14 are
minimal. Most importantly, the GCC warning reported by cfbot was
(hopefully) fixed. The patch order was also altered, v15-0001 and
v15-0002 are targeting PG15 now, the rest are targeting PG16.

Do you know that you can test a branch on cirrus without using CF bot or
mailing the patch to the list ? See src/tools/ci/README

--
Justin

#76Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Justin Pryzby (#75)
Re: Add 64-bit XIDs into PostgreSQL 15

Do you know that you can test a branch on cirrus without using CF bot or
mailing the patch to the list ? See src/tools/ci/README

Yes, sure! The main reason to post updates of this patchset is so that hackers
who are interested in the progress have a relevant, up-to-date version.
This patch is not for cfbot (I suppose it doesn't even trigger cfbot, as it is
attached to the next CF).

Pavel.

#77Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Pavel Borisov (#76)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Just to keep this thread up to date, here is patch v30, reflecting the
changes in [1].
I've been running make check-world for about 2 days on Ubuntu 64-bit and
Debian 32-bit and haven't had any failures.

[1]: /messages/by-id/CALT9ZEHLGm6dUd2SdpJxDd5t9wLYXDm6OGFgZS_8jDzFSnLvvQ@mail.gmail.com
/messages/by-id/CALT9ZEHLGm6dUd2SdpJxDd5t9wLYXDm6OGFgZS_8jDzFSnLvvQ@mail.gmail.com

--
Best regards,
Maxim Orlov.

Attachments:

v30-64bit-xids.tgzapplication/gzip; name=v30-64bit-xids.tgz
#78Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#77)
7 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a rebased version of the patch, v31.

Things changed:
- refactoring lazy page conversion after upgrade from previous versions
- refactoring SLRU segment resize
- remove U64FromFullTransactionId and FullTransactionIdFromU64
- compatibility changes with recent improvements in vacuum
- move 64-bit XID related tests from test.sh into 002_pg_upgrade.pl

--
Best regards,
Maxim Orlov.

Attachments:

v31-0005-README.XID64.patchapplication/octet-stream; name=v31-0005-README.XID64.patch
v31-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v31-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v31-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v31-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v31-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v31-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v31-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v31-0002-Use-64-bit-format-to-output-XIDs.patch
v31-0006-Use-64-bit-GUCs.patchapplication/octet-stream; name=v31-0006-Use-64-bit-GUCs.patch
v31-0007-Use-64-bit-XIDs.patchapplication/octet-stream; name=v31-0007-Use-64-bit-XIDs.patch
#79Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#78)
7 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a rebased version of the patch, v32.

Things changed:
- revert unneeded 64-bit xid related changes on sequence pages
- fix in convert_heap
- fix new page init when full_page_writes=off
- added test.sh changes into the new 002_pg_upgrade.pl
- massive refactoring of page conversions after pg_upgrade
- made xid type alignment compatible with AIX double alignment

Reviews are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v32-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v32-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v32-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v32-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v32-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v32-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v32-0005-README.XID64.patchapplication/octet-stream; name=v32-0005-README.XID64.patch
v32-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v32-0002-Use-64-bit-format-to-output-XIDs.patch
v32-0006-Use-64-bit-GUCs.patchapplication/octet-stream; name=v32-0006-Use-64-bit-GUCs.patch
v32-0007-Use-64-bit-XIDs.patchapplication/octet-stream; name=v32-0007-Use-64-bit-XIDs.patch
#80Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#79)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is an updated version of patch.
Major changes are:
- single out the options to initialize a cluster with given xid/mxid/mxoff
into a separate patch (0004) so it can be reviewed and applied separately
before the main patch.
We also created a separate CF entry to handle this [1].
- add unit tests for lazy page conversion from the 32-bit to the 64-bit xid
format (inside patch 0008).
- make logical replication use the 64-bit xid format and add a test (inside
patch 0008).
- remove unnecessary padding to compactify XLogRecord
- refactoring of the 32-to-64-bit lazy page conversion
- rebase onto the recent upstream branch

Patches 0001-0003 are identical to the v33 from Aleksander Alekseev in
thread [2].

[1]: /messages/by-id/CACG=ezaa4vqYjJ16yoxgrpa-=gXnf0Vv3Ey9bjGrRRFN2YyWFQ@mail.gmail.com
/messages/by-id/CACG=ezaa4vqYjJ16yoxgrpa-=gXnf0Vv3Ey9bjGrRRFN2YyWFQ@mail.gmail.com
[2]: /messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com
/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com

Reviews are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v34-0001-Use-64-bit-numbering-of-SLRU-pages.patchtext/x-patch; charset=US-ASCII; name=v34-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v34-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchtext/x-patch; charset=US-ASCII; name=v34-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v34-0005-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=US-ASCII; name=v34-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v34-0002-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=US-ASCII; name=v34-0002-Use-64-bit-format-to-output-XIDs.patch
v34-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=US-ASCII; name=v34-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v34-0006-README.XID64.patchtext/x-patch; charset=US-ASCII; name=v34-0006-README.XID64.patch
v34-0007-Use-64-bit-GUCs.patchtext/x-patch; charset=US-ASCII; name=v34-0007-Use-64-bit-GUCs.patch
v34-0008-Use-64-bit-XIDs.patchtext/x-patch; charset=US-ASCII; name=v34-0008-Use-64-bit-XIDs.patch
#81Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#80)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a new rebased version of the patchset plus minor changes:
- return the initial cluster xid to FirstNormalTransactionId
- remove unnecessary truncation in TruncateSUBTRANS

Only the 0008 patch is changed.

Reviews are very welcome!
--
Best regards,
Maxim Orlov.

Attachments:

v35-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=US-ASCII; name=v35-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v35-0001-Use-64-bit-numbering-of-SLRU-pages.patchtext/x-patch; charset=US-ASCII; name=v35-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v35-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchtext/x-patch; charset=US-ASCII; name=v35-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v35-0002-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=US-ASCII; name=v35-0002-Use-64-bit-format-to-output-XIDs.patch
v35-0005-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=US-ASCII; name=v35-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v35-0006-README.XID64.patchtext/x-patch; charset=US-ASCII; name=v35-0006-README.XID64.patch
v35-0007-Use-64-bit-GUCs.patchtext/x-patch; charset=US-ASCII; name=v35-0007-Use-64-bit-GUCs.patch
v35-0008-Use-64-bit-XIDs.patchtext/x-patch; charset=US-ASCII; name=v35-0008-Use-64-bit-XIDs.patch
#82Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#81)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a rebased and improved version of the patchset: we now use 64-bit
atomic operations on shared memory, which was not previously guaranteed on
32-bit architectures.

Before this change, under heavy transaction concurrency, we occasionally got
the warning "xmin is far in the past". This was seen on 32-bit architectures.

On 64-bit platforms these changes do not affect performance, since 64-bit
loads and stores are already atomic there.
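
To illustrate the kind of change meant here, a minimal sketch using
PostgreSQL's pg_atomic_* API (the struct and field names below are
hypothetical, for illustration only; the real patch touches existing shared
structures):

```c
#include "postgres.h"
#include "port/atomics.h"

/* Hypothetical shared state holding a 64-bit xid. */
typedef struct XidSharedStateSketch
{
	pg_atomic_uint64	oldest_xid;
} XidSharedStateSketch;

static inline TransactionId
read_shared_xid(XidSharedStateSketch *state)
{
	/*
	 * On 64-bit platforms this compiles to a plain load; on 32-bit platforms
	 * the atomic read prevents a torn (half-updated) 64-bit value, which is
	 * the failure mode behind the rare "xmin is far in the past" warning.
	 */
	return (TransactionId) pg_atomic_read_u64(&state->oldest_xid);
}

static inline void
write_shared_xid(XidSharedStateSketch *state, TransactionId xid)
{
	pg_atomic_write_u64(&state->oldest_xid, (uint64) xid);
}
```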

Only the 0008 patch is changed.

--
Best regards,
Maxim Orlov.

Attachments:

v36-0006-README.XID64.patchtext/x-patch; charset=US-ASCII; name=v36-0006-README.XID64.patch
v36-0007-Use-64-bit-GUCs.patchtext/x-patch; charset=US-ASCII; name=v36-0007-Use-64-bit-GUCs.patch
v36-0005-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=US-ASCII; name=v36-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v36-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchtext/x-patch; charset=US-ASCII; name=v36-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v36-0008-Use-64-bit-XIDs.patchtext/x-patch; charset=US-ASCII; name=v36-0008-Use-64-bit-XIDs.patch
v36-0001-Use-64-bit-numbering-of-SLRU-pages.patchtext/x-patch; charset=US-ASCII; name=v36-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v36-0002-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=US-ASCII; name=v36-0002-Use-64-bit-format-to-output-XIDs.patch
v36-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=US-ASCII; name=v36-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
#83Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Maxim Orlov (#82)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, hackers!

I've updated the 64-xid patchset (actually only the 0008 patch is changed).

The update addresses a corner case where VACUUM FULL could not complete after
pg_upgrade from a cluster containing a maximum-size tuple in plain storage. A
page with such tuples cannot be converted to the 64-xid format, as there is no
room for HeapPageSpecial, so it remains in the DoubleXmax format, and this
cannot change until that tuple version is deleted. The change makes VACUUM
FULL copy these pages instead of throwing an error.

The patchset is also rebased onto the current master branch.

Your discussion and thoughts are very much welcome!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Attachments:

v37-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v37-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v37-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v37-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v37-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v37-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v37-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v37-0007-Use-64-bit-GUCs.patch
v37-0006-README.XID64.patchapplication/octet-stream; name=v37-0006-README.XID64.patch
v37-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v37-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v37-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v37-0002-Use-64-bit-format-to-output-XIDs.patch
v37-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v37-0008-Use-64-bit-XIDs.patch
#84Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#83)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, hackers!
The patch stopped applying due to upstream changes.
I've rebased it again. PFA v38.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Attachments:

v38-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v38-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v38-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v38-0007-Use-64-bit-GUCs.patch
v38-0006-README.XID64.patchapplication/octet-stream; name=v38-0006-README.XID64.patch
v38-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v38-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v38-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v38-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v38-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v38-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v38-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v38-0008-Use-64-bit-XIDs.patch
v38-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v38-0002-Use-64-bit-format-to-output-XIDs.patch
#85Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Pavel Borisov (#84)
Re: Add 64-bit XIDs into PostgreSQL 15

This seems to be causing cfbot/cirrusci to time out.

Here's the build history
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/38/3594

https://cirrus-ci.com/task/4809278652416000 4 weeks ago on macos
https://cirrus-ci.com/task/5559884417597440 2 weeks ago on macos
https://cirrus-ci.com/task/6629554545491968 2 weeks ago on macos
https://cirrus-ci.com/task/5253255562264576 this week on freebsd

It seems like there's a larger discussion to be had about the architecture
of the patch, but I thought I'd point out this detail.

#86Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Justin Pryzby (#85)
Re: Add 64-bit XIDs into PostgreSQL 15

This seems to be causing cfbot/cirrusci to time out.

Here's the build history
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/38/3594

https://cirrus-ci.com/task/4809278652416000 4 weeks ago on macos
https://cirrus-ci.com/task/5559884417597440 2 weeks ago on macos
https://cirrus-ci.com/task/6629554545491968 2 weeks ago on macos
https://cirrus-ci.com/task/5253255562264576 this week on freebsd

It seems like there's a larger discussion to be had about the architecture
of the patch, but I thought I'd point out this detail.

Thanks! Will check it out.

Pavel

#87Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Pavel Borisov (#86)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a new version of the patchset with the following changes:
- change the page_conversion unit test to address rare cfbot failures (the
reason cfbot timed out was that the unit test was not written carefully
enough);
- fix pg_upgrade on 32-bit systems;
- switch to the XidList type (introduced by f10a025cfe97c1a34) for logical
replication, abandoning the previously used Int64List type;
- use the catalog version instead of the major version as the boundary between
32-bit and 64-bit xids in pg_upgrade;
- this catalog version boundary is temporarily set to 999999999 for pg_upgrade
testing purposes;
- also rebased onto the current master branch.

On 32-bit architectures we noticed that pg_upgrade from 32-bit to 64-bit xids
fails due to a different TOAST_MAX_CHUNK_SIZE. On a 64-bit xid page there is
less available space because of the added heap page special area, which leads
to a recalculation of TOAST_MAX_CHUNK_SIZE.

This was not a problem on 64-bit architectures, as the padding bytes on 32-bit
xid TOAST pages were enough to accommodate the heap page special area of
64-bit xids with TOAST_MAX_CHUNK_SIZE unchanged. On 32-bit architectures the
padding bytes were not enough, and the 64-bit xid version would need a
different TOAST_MAX_CHUNK_SIZE.

A changed TOAST_MAX_CHUNK_SIZE makes pg_upgrade onto 64-bit xids impossible.
This was a real problem, since TOAST of a relation requires all chunks to be
the same size. In other words, we cannot mix TOAST chunks of the previous
(32-bit xid) TOAST pages with new 64-bit xid TOAST pages.

The solution was to use different special areas for TOAST and heap pages,
since TOAST tuples cannot have multixacts and therefore do not need
pd_multi_base on the page.
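
To make the shape of that solution concrete, here is a rough sketch of the two
special areas; the struct and field names come from the patchset, but treat
the exact layout below as illustrative rather than the committed definition:

```c
/* Special area appended to heap pages: bases for 64-bit xids and multixacts. */
typedef struct HeapPageSpecialData
{
	TransactionId	pd_xid_base;	/* base for tuples' 32-bit xmin/xmax offsets */
	TransactionId	pd_multi_base;	/* base for 32-bit multixact offsets */
} HeapPageSpecialData;

/*
 * Special area for TOAST pages.  TOAST tuples never carry multixacts, so no
 * pd_multi_base is needed.  Keeping this special area smaller is what lets
 * TOAST_MAX_CHUNK_SIZE stay unchanged, so old and new TOAST chunks remain
 * compatible across pg_upgrade, including on 32-bit builds.
 */
typedef struct ToastPageSpecialData
{
	TransactionId	pd_xid_base;
} ToastPageSpecialData;
```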

Thus, v39 is a significant improvement over v38.

As always, feel free to review and share your thoughts on a subject.

--
Best regards,
Maxim Orlov.

Attachments:

v39-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v39-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v39-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v39-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v39-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v39-0002-Use-64-bit-format-to-output-XIDs.patch
v39-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v39-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v39-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v39-0007-Use-64-bit-GUCs.patch
v39-0006-README.XID64.patchapplication/octet-stream; name=v39-0006-README.XID64.patch
v39-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v39-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v39-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v39-0008-Use-64-bit-XIDs.patch
#88Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#87)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

The patchset stopped applying, so here is a rebased version.

Also, we forgot to carry over the atomic shared-memory xid access changes from
v36 into v39, so they are added back here in v40.

As always, reviews are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v40-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v40-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v40-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v40-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v40-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v40-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v40-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v40-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v40-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v40-0002-Use-64-bit-format-to-output-XIDs.patch
v40-0006-README.XID64.patchapplication/octet-stream; name=v40-0006-README.XID64.patch
v40-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v40-0007-Use-64-bit-GUCs.patch
v40-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v40-0008-Use-64-bit-XIDs.patch
#89Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#88)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Due to upstream changes, the patchset stopped applying.
Here is a rebased version with minor refactoring in contrib/pageinspect.

--
Best regards,
Maxim Orlov.

Attachments:

v41-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v41-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v41-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v41-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v41-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v41-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v41-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v41-0002-Use-64-bit-format-to-output-XIDs.patch
v41-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v41-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v41-0006-README.XID64.patchapplication/octet-stream; name=v41-0006-README.XID64.patch
v41-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v41-0007-Use-64-bit-GUCs.patch
v41-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v41-0008-Use-64-bit-XIDs.patch
#90Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#89)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Once again, the patchset stopped applying due to upstream changes.
Here is a rebased version with minor refactoring in sequence.c.
It also removes unnecessary refactoring of pg_upgrade/file.c, returning it to
the upstream state.

--
Best regards,
Maxim Orlov.

Attachments:

v42-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v42-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v42-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v42-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v42-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v42-0002-Use-64-bit-format-to-output-XIDs.patch
v42-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v42-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v42-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v42-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v42-0006-README.XID64.patchapplication/octet-stream; name=v42-0006-README.XID64.patch
v42-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v42-0007-Use-64-bit-GUCs.patch
v42-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v42-0008-Use-64-bit-XIDs.patch
#91Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Maxim Orlov (#90)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, hackers!
v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
GitHub link relative to v42 attached as a patch.

--
Best regards,
Pavel Borisov

#92Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#91)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, 15 Jul 2022 at 15:20, Pavel Borisov <pashkin.elfe@gmail.com> wrote:

Hi, hackers!
v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
GitHub link relative to v42 attached as a patch.

Github link is as follows:
https://github.com/ziva777/postgres/tree/64xid-cf
Maybe this will enable CFbot to see it.

--
Best regards,
Pavel Borisov

#93Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Pavel Borisov (#92)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, Jul 15, 2022 at 03:23:29PM +0400, Pavel Borisov wrote:

On Fri, 15 Jul 2022 at 15:20, Pavel Borisov <pashkin.elfe@gmail.com> wrote:

Hi, hackers!
v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
GitHub link relative to v42 attached as a patch.

Github link is as follows:
https://github.com/ziva777/postgres/tree/64xid-cf
Maybe this will enable CFbot to see it.

My suggestion was a bit ambiguous, sorry for the confusion.

The "Git link" is just an annotation - cfbot doesn't try to do anything with
it[0](I think cfbot *could* try to read the git link, and apply patches that it finds there, but that's an idea that hasn't been proposed or discussed. It'd need to know which branch to use, and it'd need to know when to use the git link and when to use the most-recent email attachments). (but cirrusci will run the same checks under your github account).
What I meant was that it doesn't seem imporant to send rebases multiple times
per week if there's been no other changes. Anyone is still able to review the
patch. It's possible to build it by applying the patch to a checkout of the
postgres tree at the time the patch was sent. Or by using the git link. Or by
checking out cfbot's branch here, from the last time it *did* apply.
https://github.com/postgresql-cfbot/postgresql/tree/commitfest/38/3594

[0]: (I think cfbot *could* try to read the git link, and apply patches that it
finds there, but that's an idea that hasn't been proposed or discussed. It'd
need to know which branch to use, and it'd need to know when to use the git
link and when to use the most-recent email attachments).

--
Justin

#94Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Justin Pryzby (#93)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, 15 Jul 2022 at 16:17, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Fri, Jul 15, 2022 at 03:23:29PM +0400, Pavel Borisov wrote:

On Fri, 15 Jul 2022 at 15:20, Pavel Borisov <pashkin.elfe@gmail.com> wrote:

Hi, hackers!
v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
GitHub link relative to v42 attached as a patch.

Github link is as follows:
https://github.com/ziva777/postgres/tree/64xid-cf
Maybe this will enable CFbot to see it.

My suggestion was a bit ambiguous, sorry for the confusion.

The "Git link" is just an annotation - cfbot doesn't try to do anything with
it[0] (but cirrusci will run the same checks under your github account).
What I meant was that it doesn't seem important to send rebases multiple times
per week if there have been no other changes. Anyone is still able to review the
patch. It's possible to build it by applying the patch to a checkout of the
postgres tree at the time the patch was sent. Or by using the git link. Or by
checking out cfbot's branch here, from the last time it *did* apply.
https://github.com/postgresql-cfbot/postgresql/tree/commitfest/38/3594

[0] (I think cfbot *could* try to read the git link, and apply patches that it
finds there, but that's an idea that hasn't been proposed or discussed. It'd
need to know which branch to use, and it'd need to know when to use the git
link and when to use the most-recent email attachments).

Hi, Justin!

I can agree with you that sending rebased patches too often can be a little
annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose
it's much easier and more comfortable to review the patches that at least
apply cleanly and pass all tests. So if Cfbot is red for a long time I feel
we need to send a rebased patchset anyway.

I'll try not to do this too often, but frankly, I don't see a better
alternative at the moment.

Anyway, big thanks for your advice and attention to this thread!

--
Best regards,
Pavel Borisov

#95Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Pavel Borisov (#94)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

I can agree with you that sending rebased patches too often can be a little annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose it's much easier and more comfortable to review the patches that at least apply cleanly and pass all tests. So if Cfbot is red for a long time I feel we need to send a rebased patchset anyway.

I'll try to not doing this too often but frankly, I don't see a better alternative at the moment.

Considering the overall activity on the mailing list personally I
don't see a problem here. Several extra emails don't bother me at all,
but I would like to see a green cfbot report for an open item in the
CF application. Otherwise someone will complain that the patch doesn't
apply anymore and the result will be the same as for sending an
updated patch, except that we will receive at least two emails instead
of one.

--
Best regards,
Aleksander Alekseev

#96Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Aleksander Alekseev (#95)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

I can agree with you that sending rebased patches too often can be a
little annoying. On the other hand, otherwise, it's just red in Cfbot. I
suppose it's much easier and more comfortable to review the patches that at
least apply cleanly and pass all tests. So if Cfbot is red for a long time
I feel we need to send a rebased patchset anyway.

I'll try to not doing this too often but frankly, I don't see a better
alternative at the moment.

Considering the overall activity on the mailing list personally I
don't see a problem here. Several extra emails don't bother me at all,
but I would like to see a green cfbot report for an open item in the
CF application. Otherwise someone will complain that the patch doesn't
apply anymore and the result will be the same as for sending an
updated patch, except that we will receive at least two emails instead
of one.

Hi, Aleksander!
I agree with you. I also consider a green cfbot entry important. So PFA
rebased v43.

--
Best regards,
Pavel Borisov

Attachments:

v43-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v43-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v43-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v43-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v43-0006-README.XID64.patchapplication/octet-stream; name=v43-0006-README.XID64.patch
v43-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v43-0007-Use-64-bit-GUCs.patch
v43-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v43-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v43-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v43-0008-Use-64-bit-XIDs.patch
v43-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v43-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v43-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v43-0002-Use-64-bit-format-to-output-XIDs.patch
#97Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#10)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Jan 05, 2022 at 06:12:26PM -0600, Justin Pryzby wrote:

On Wed, Jan 05, 2022 at 06:51:37PM -0500, Bruce Momjian wrote:

On Tue, Jan 4, 2022 at 10:22:50PM +0000, Finnerty, Jim wrote:

I'm concerned about the maintainability impact of having 2 new
on-disk page formats. It's already complex enough with XIDs and
multixact-XIDs.

If the lack of space for the two epochs in the special data area is
a problem only in an upgrade scenario, why not resolve the problem
before completing the upgrade process like a kind of post-process
pg_repack operation that converts all "double xmax" pages to
the "double-epoch" page format? i.e. maybe the "double xmax"
representation is needed as an intermediate representation during
upgrade, but after upgrade completes successfully there are no pages
with the "double-xmax" representation. This would eliminate a whole
class of coding errors and would make the code dealing with 64-bit
XIDs simpler and more maintainable.

Well, yes, we could do this, and it would avoid the complexity of having
to support two XID representations, but we would need to accept that
fast pg_upgrade would be impossible in such cases, since every page
would need to be checked and potentially updated.

You might try to do this while the server is first started and running
queries, but I think we found out from the online checkpoint patch that

I think you meant the online checksum patch. Which this reminded me of, too.

I wondered whether anyone had considered using relation forks to maintain state
of these long, transitional processes.

Either a whole new fork, or additional bits in the visibility map, which has
page-level bits.

There'd still need to be a flag somewhere indicating whether
checksums/xid64s/etc were enabled cluster-wide. The VM/fork bits would need to
be checked while the cluster was being re-processed online. This would add
some overhead. After the cluster had reached its target state, the flag could
be set, and the VM bits would no longer need to be checked.

--
Justin

#98Dilip Kumar
Dilip Kumar
dilipbalaut@gmail.com
In reply to: Pavel Borisov (#96)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Jul 18, 2022 at 2:54 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

I can agree with you that sending rebased patches too often can be a little annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose it's much easier and more comfortable to review the patches that at least apply cleanly and pass all tests. So if Cfbot is red for a long time I feel we need to send a rebased patchset anyway.

I'll try to not doing this too often but frankly, I don't see a better alternative at the moment.

Considering the overall activity on the mailing list personally I
don't see a problem here. Several extra emails don't bother me at all,
but I would like to see a green cfbot report for an open item in the
CF application. Otherwise someone will complain that the patch doesn't
apply anymore and the result will be the same as for sending an
updated patch, except that we will receive at least two emails instead
of one.

Hi, Alexander!
Agree with you. I also consider green cfbot entry important. So PFA rebased v43.

Since we have converted TransactionId to 64-bit, do we still need
the concept of FullTransactionId? I mean, it is really confusing to
have 3 forms of transaction ids, i.e. TransactionId,
FullTransactionId and ShortTransactionId.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#99Dilip Kumar
Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#98)
Re: Add 64-bit XIDs into PostgreSQL 15

On Sun, Sep 4, 2022 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Jul 18, 2022 at 2:54 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

I can agree with you that sending rebased patches too often can be a little annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose it's much easier and more comfortable to review the patches that at least apply cleanly and pass all tests. So if Cfbot is red for a long time I feel we need to send a rebased patchset anyway.

I'll try to not doing this too often but frankly, I don't see a better alternative at the moment.

Considering the overall activity on the mailing list personally I
don't see a problem here. Several extra emails don't bother me at all,
but I would like to see a green cfbot report for an open item in the
CF application. Otherwise someone will complain that the patch doesn't
apply anymore and the result will be the same as for sending an
updated patch, except that we will receive at least two emails instead
of one.

Hi, Alexander!
Agree with you. I also consider green cfbot entry important. So PFA rebased v43.

Since we have converted TransactionId to 64-bit, so do we still need
the concept of FullTransactionId? I mean it is really confusing to
have 3 forms of transaction ids. i.e. Transaction Id,
FullTransactionId and ShortTransactionId.

I have done some more reviews to understand the idea. I think this
patch needs far more comments to make it completely readable.

1.
 typedef struct HeapTupleData
 {
+ TransactionId t_xmin; /* base value for normal transaction ids */
+ TransactionId t_xmax; /* base value for mutlixact */

I think the field name and comments are not in sync, field says xmin
and xmax whereas the comment says base value for
transaction id and multi-xact.

2.
extern bool heap_page_prepare_for_xid(Relation relation, Buffer buffer,
TransactionId xid, bool multi);

I noticed that this function is returning bool but all the callers are
ignoring the return type.

3.
+static int
+heap_page_try_prepare_for_xid(Relation relation, Buffer buffer, Page page,
+   TransactionId xid, bool multi, bool is_toast)
+{
+ TransactionId base;
+ ShortTransactionId min = InvalidTransactionId,

add function header comments.

4.

+ if (!multi)
+ {
+ Assert(!is_toast || !(htup->t_infomask & HEAP_XMAX_IS_MULTI));
+
+ if (TransactionIdIsNormal(htup->t_choice.t_heap.t_xmin) &&
+ !HeapTupleHeaderXminFrozen(htup))
+ {
+ xid_min_max(min, max, htup->t_choice.t_heap.t_xmin, &found);
+ }
+
+ if (htup->t_infomask & HEAP_XMAX_INVALID)
+ continue;
+
+ if ((htup->t_infomask & HEAP_XMAX_IS_MULTI) &&
+ (!(htup->t_infomask & HEAP_XMAX_LOCK_ONLY)))
+ {
+ TransactionId update_xid;
+ ShortTransactionId xid;
+
+ Assert(!is_toast);
+ update_xid = MultiXactIdGetUpdateXid(HeapTupleHeaderGetRawXmax(page, htup),
+ htup->t_infomask);
+ xid = NormalTransactionIdToShort(HeapPageGetSpecial(page)->pd_xid_base,
+ update_xid);
+
+ xid_min_max(min, max, xid, &found);
+ }
+ }

Why no handling for multi? And this function has absolutely no
comments to understand the reason for this.

5.
+ if (IsToastRelation(relation))
+ {
+ PageInit(page, BufferGetPageSize(buffer), sizeof(ToastPageSpecialData));
+ ToastPageGetSpecial(page)->pd_xid_base = RecentXmin -
FirstNormalTransactionId;
+ }
+ else
+ {
+ PageInit(page, BufferGetPageSize(buffer), sizeof(HeapPageSpecialData));
+ HeapPageGetSpecial(page)->pd_xid_base = RecentXmin - FirstNormalTransactionId;
+ }

Why pd_xid_base can not be just RecentXmin? Please explain in the
comments above.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#100Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Dilip Kumar (#99)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Thank you for your review.

Since we have converted TransactionId to 64-bit, so do we still need

the concept of FullTransactionId? I mean it is really confusing to
have 3 forms of transaction ids. i.e. Transaction Id,
FullTransactionId and ShortTransactionId.

Yeah, I totally agree with you. Actually, it would be better to get rid of
them if this patch set gets committed.
We've already tried to do some experiments on this issue, but unfortunately,
this resulted in bloating the patch set. So, we decided to address this in the
future.

1.

typedef struct HeapTupleData
{
+ TransactionId t_xmin; /* base value for normal transaction ids */
+ TransactionId t_xmax; /* base value for mutlixact */

I think the field name and comments are not in sync, field says xmin
and xmax whereas the comment says base value for
transaction id and multi-xact.

Fixed.

2.
extern bool heap_page_prepare_for_xid(Relation relation, Buffer buffer,
TransactionId xid, bool multi);

I noticed that this function is returning bool but all the callers are
ignoring the return type.

Fixed.

3.
+static int
+heap_page_try_prepare_for_xid(Relation relation, Buffer buffer, Page page,
+   TransactionId xid, bool multi, bool is_toast)
+{
+ TransactionId base;
+ ShortTransactionId min = InvalidTransactionId,

add function header comments.

Fixed. Also, I made some refactoring to make this more clear.

4.
Why no handling for multi? And this function has absolutely no
comments to understand the reason for this.

Actually, this function works for multixact transactions as well as for
"regular" transactions. But in the case of "regular" transactions, we have to
look through the multixacts to see whether any update transaction for a
particular tuple is present or not.
I added comments around here to make this clear.

5.
Why pd_xid_base can not be just RecentXmin? Please explain in the
comments above.

We're doing this, if I'm not mistaken, to be able to represent all the
possible XID values, including InvalidTransactionId, FrozenTransactionId and
so on. In other words, we must be able to store XID values including the
special ones.
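
For the record, here is how I read the base/offset arithmetic behind that; the
function names appear in the patch, but treat the bodies below as an
illustrative sketch rather than the exact committed code:

```c
/*
 * A tuple on disk stores a 32-bit "short" xid, which is an offset from the
 * page's 64-bit pd_xid_base.  Special xids (InvalidTransactionId = 0,
 * BootstrapTransactionId = 1, FrozenTransactionId = 2) are stored as-is.
 * Setting pd_xid_base = RecentXmin - FirstNormalTransactionId means the first
 * normal xid written to a fresh page maps to the short value
 * FirstNormalTransactionId (3), leaving 0..2 free for the special values.
 */
static inline TransactionId
ShortTransactionIdToNormal(TransactionId base, ShortTransactionId xid)
{
	return TransactionIdIsNormal(xid) ?
		(TransactionId) (xid + base) : (TransactionId) xid;
}

static inline ShortTransactionId
NormalTransactionIdToShort(TransactionId base, TransactionId xid)
{
	return TransactionIdIsNormal(xid) ?
		(ShortTransactionId) (xid - base) : (ShortTransactionId) xid;
}
```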

Here is a rebased version of a patch set. As always, reviews are very
welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v46-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v46-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v46-0006-README.XID64.patchapplication/octet-stream; name=v46-0006-README.XID64.patch
v46-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v46-0007-Use-64-bit-GUCs.patch
v46-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v46-0008-Use-64-bit-XIDs.patch
v46-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v46-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v46-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v46-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v46-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v46-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v46-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v46-0002-Use-64-bit-format-to-output-XIDs.patch
#101Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#100)
Re: Add 64-bit XIDs into PostgreSQL 15

I have to say, to my embarrassment, that after sending the previous email I
noticed minor imperfections in the patch set caused by the last rebase.
These imperfections led to a cfbot failure. I'll address this issue in the
next iteration in order not to generate excessive traffic.

--
Best regards,
Maxim Orlov.

#102Zhang Mingli
Zhang Mingli
zmlpostgres@gmail.com
In reply to: Maxim Orlov (#101)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

Is this patch targeting PG16 now?

I want to have a look at these patches, but applying them on master failed:

Applying: Use 64-bit numbering of SLRU pages.
Applying: Use 64-bit format to output XIDs
Applying: Use 64-bit FullTransactionId instead of Epoch:xid
Applying: Use 64-bit pages representation in SLRU callers.
error: patch failed: src/backend/access/transam/multixact.c:1228
error: src/backend/access/transam/multixact.c: patch does not apply
Patch failed at 0004 Use 64-bit pages representation in SLRU callers.
hint: Use 'git am --show-current-patch=diff' to see the failed patch

Regards,
Zhang Mingli

#103Zhang Mingli
Zhang Mingli
zmlpostgres@gmail.com
In reply to: Maxim Orlov (#101)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

With these patches, it seems that we no longer need to handle wraparound in
GetNextLocalTransactionId() either, as LocalTransactionId is uint64 now.

```
LocalTransactionId
GetNextLocalTransactionId(void)
{
    LocalTransactionId result;

    /* loop to avoid returning InvalidLocalTransactionId at wraparound */
    do
    {
        result = nextLocalTransactionId++;
    } while (!LocalTransactionIdIsValid(result));

    return result;
}
```

Regards,
Zhang Mingli

#104Justin Pryzby
Justin Pryzby
pryzby@telsasoft.com
In reply to: Zhang Mingli (#102)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Sep 20, 2022 at 03:37:47PM +0800, Zhang Mingli wrote:

I want to have a look at these patches, but apply on master failed:

Yeah, it's likely to break every week or more often.

You have a few options:

0) resolve the conflict yourself;

1) apply the patch to the commit that the authors sent it against, or
some commit before the conflicting file(s) were changed in master. Like
maybe "git checkout -b 64bitxids f66d997fd".

2) Use the last patch that cfbot successfully created. You can read the
patch on github's web interface, or add cfbot's user as a remote to use
the patch locally for review and/or compilation. Something like "git
remote add cfbot https://github.com/postgresql-cfbot/postgresql; git
fetch cfbot commitfest/39/3594; git checkout -b 64bitxids
cfbot/commitfest/39/3594". (Unfortunately, cfbot currently squishes the
patch series into a single commit and loses the commit message).

You could also check the git link in the commitfest, to see if the
author has already rebased it, but haven't yet mailed the rebased patch
to the list. In this case, that's not true, but you could probably use
the author's branch on github, too.
https://commitfest.postgresql.org/39/3594/

--
Justin

#105Zhang Mingli
Zhang Mingli
zmlpostgres@gmail.com
In reply to: Justin Pryzby (#104)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,
On Sep 20, 2022, 17:26 +0800, Justin Pryzby <pryzby@telsasoft.com>, wrote:

On Tue, Sep 20, 2022 at 03:37:47PM +0800, Zhang Mingli wrote:

I want to have a look at these patches, but apply on master failed:

Yeah, it's likely to break every week or more often.

You have a few options:

0) resolve the conflict yourself;

1) apply the patch to the commit that the authors sent it against, or
some commit before the conflicting file(s) were changed in master. Like
maybe "git checkout -b 64bitxids f66d997fd".

2) Use the last patch that cfbot successfully created. You can read the
patch on github's web interface, or add cfbot's user as a remote to use
the patch locally for review and/or compilation. Something like "git
remote add cfbot https://github.com/postgresql-cfbot/postgresql; git
fetch cfbot commitfest/39/3594; git checkout -b 64bitxids
cfbot/commitfest/39/3594". (Unfortunately, cfbot currently squishes the
patch series into a single commit and loses the commit message).

You could also check the git link in the commitfest, to see if the
author has already rebased it, but haven't yet mailed the rebased patch
to the list. In this case, that's not true, but you could probably use
the author's branch on github, too.
https://commitfest.postgresql.org/39/3594/

--
Justin

Got it, thanks.

Regards,
Zhang Mingli

#106Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Zhang Mingli (#105)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Here is a rebased version of the patch set.
Major changes are:
1. Fix a rare replica fault.
During page pruning in heap_page_prune, page fragmentation repair is
controlled by the repairFragmentation parameter. On a replica, however, when
handling the XLOG_HEAP2_PRUNE record type in heap_xlog_prune, we always called
heap_page_prune_execute with the repairFragmentation parameter set to true.
This caused page inconsistency and led to a crash of the replica. Fix this by
adding a new flag to struct xl_heap_prune (see the sketch after this list).
2. Add support for meson build.
3. Add assertion "buffer is locked" in HeapTupleCopyBaseFromPage.
4. Add assertion "buffer is locked exclusive" in heap_page_shift_base.
5. Prevent excessive growth of xmax in heap_prepare_freeze_tuple.
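
To make item 1 a bit more concrete, here is a sketch of what the WAL record
change amounts to. xl_heap_prune and its existing fields are upstream
PostgreSQL's; the flag name below is my paraphrase of the fix, not necessarily
the patch's exact field:

```c
typedef struct xl_heap_prune
{
	TransactionId	latestRemovedXid;
	uint16			nredirected;
	uint16			ndead;
	bool			repair_fragmentation;	/* hypothetical: carried in the WAL
											 * record so that heap_xlog_prune()
											 * can pass the same value the
											 * primary used down to
											 * heap_page_prune_execute() */
	/* OFFSET NUMBERS are in the block reference 0 */
} xl_heap_prune;
```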

As always, reviews are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v47-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v47-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v47-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v47-0002-Use-64-bit-format-to-output-XIDs.patch
v47-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v47-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v47-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v47-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v47-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v47-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v47-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v47-0007-Use-64-bit-GUCs.patch
v47-0006-README.XID64.patchapplication/octet-stream; name=v47-0006-README.XID64.patch
v47-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v47-0008-Use-64-bit-XIDs.patch
#107Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Maxim Orlov (#106)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, Oct 07, 2022 at 02:04:09PM +0300, Maxim Orlov wrote:

As always, reviews are very welcome!

This patch set needs a rebase, as far as I can see.
--
Michael

#108Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Michael Paquier (#107)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

This patch set needs a rebase, as far as I can see.

Done! Thanks! Here is the rebased version.

This version has a bug fix for multixact replication. Previous versions of
the patch set did not write pd_multi_base to WAL, so this field was set
to 0 upon WAL replay on a replica.
This caused the replica to panic. Fix this by adding a page's pd_multi_base
to WAL. An appropriate TAP test is added.

Also, add refactoring and improvements in heapam.c in order to reduce the diff
and make it more "tidy".

Reviews and opinions are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v48-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v48-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v48-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v48-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v48-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v48-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v48-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v48-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v48-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v48-0002-Use-64-bit-format-to-output-XIDs.patch
v48-0006-README.XID64.patchapplication/octet-stream; name=v48-0006-README.XID64.patch
v48-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v48-0007-Use-64-bit-GUCs.patch
v48-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v48-0008-Use-64-bit-XIDs.patch
#109Zhang Mingli
Zhang Mingli
zmlpostgres@gmail.com
In reply to: Maxim Orlov (#108)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On Oct 22, 2022, 00:09 +0800, Maxim Orlov <orlovmg@gmail.com>, wrote:

Done! Thanks! Here is the rebased version.

This version has bug fix for multixact replication. Previous versions of the patch set does not write pd_multi_base in WAL. Thus, this field was set to 0 upon WAL reply on replica.
This caused replica to panic. Fix this by adding pd_multi_base of a page into WAL. Appropriate tap test is added.

Also, add refactoring and improvements in heapam.c in order to reduce diff and make it more "tidy".

Reviews and opinions are very welcome!

--
Best regards,
Maxim Orlov.

Found some outdated code comments around several variables, such as xidWrapLimit/xidWarnLimit/xidStopLimit.

These variables are not used any more.

I attach an additional v48-0009 patch; as these are just comment changes, apply it if you want to.

Regards,
Zhang Mingli

Attachments:

v48-0009-remove-some-outdate-codes-comments-about-xidWrap.patchapplication/octet-stream
#110Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Zhang Mingli (#109)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

I attach an additional V48-0009 patch as they are just comments, apply it
if you want to.

Big thank you for your review. I've applied your addition in the recent
patch set below.

Besides the change mentioned above, the following changes are made:
- rename HeapTupleCopyBaseFromPage to HeapTupleCopyXidsFromPage, since the old
name came from the time when "t_xid_base" was stored in the tuple and does not
correspond to the current state of the code;
- replace ToastTupleHeader* calls with HeapHeader* calls taking an "is_toast"
argument. This reduces the diff and makes the code more readable;
- put HeapTupleSetZeroXids calls in several places for the sake of redundancy;
- in heap_tuple_would_freeze, add a case to reset xmax without reading clog;
- rename SeqTupleHeaderSetXmax/Xmin to SeqTupleSetXmax/min and refactor the
function; now it will set both the HeapTuple and HeapTupleHeader xmax;
- add a case for int64 values in check_GUC_init;
- massive refactoring in htup_details.h to use inline functions (which give
type checking) instead of macros;
- reorder code in htup_details.h to reduce the overall diff.

As always, reviews and opinions are very welcome!

--
Best regards,
Maxim Orlov.

Attachments:

v49-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v49-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v49-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v49-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v49-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v49-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v49-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v49-0002-Use-64-bit-format-to-output-XIDs.patch
v49-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v49-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v49-0006-README.XID64.patchapplication/octet-stream; name=v49-0006-README.XID64.patch
v49-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v49-0007-Use-64-bit-GUCs.patch
v49-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v49-0008-Use-64-bit-XIDs.patch
#111Thom Brown
Thom Brown
thom@linux.com
In reply to: Maxim Orlov (#110)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, 3 Nov 2022 at 08:12, Maxim Orlov <orlovmg@gmail.com> wrote:

Hi!

I attach an additional V48-0009 patch as they are just comments, apply it if you want to.

Big thank you for your review. I've applied your addition in the recent patch set below.

Besides, mentioned above, next changes are made:
- rename HeapTupleCopyBaseFromPage to HeapTupleCopyXidsFromPage, since this old name came from the time when the "t_xid_base" was stored in tuple,
and not correspond to recent state of the code;
- replace ToastTupleHeader* calls with HeapHeader* with the "is_toast" argument. This reduces diff and make the code more readable;
- put HeapTupleSetZeroXids calls in several places for the sake of redundancy;
- in heap_tuple_would_freeze add case to reset xmax without reading clog;
- rename SeqTupleHeaderSetXmax/Xmin to SeqTupleSetXmax/min and refactoring of the function; Now it will set HeapTuple and HeapTupleHeader xmax;
- add case of int64 values in check_GUC_init;
- massive refactoring in htup_details.h to use inline functions with type control over macro;
- reorder code in htup_details.h to reduce overall diff.

As always, reviews and opinions are very welcome!

0008 needs a rebase. heapam.h and catversion.h are failing.

Regards

Thom

#112Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Thom Brown (#111)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

0008 needs a rebase. heapam.h and catversion.h are failing.

Regards

Thom

Thanks, done!

Also, add copying of xmin and xmax while the page is locked. In heapgetpage
we have to copy the tuples' xmin and xmax while we're holding the buffer lock.
Since we do not hold the lock after that, the values of xmin or xmax may
change, and we could otherwise read incorrect values of those fields. This
affects only the scenario where the user selects xmin or xmax "directly" in an
SQL query, AFAICS.
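
A rough sketch of that ordering constraint (heapgetpage-like pseudologic;
buffer and snapshot come from the surrounding context, and the exact signature
of the patch's HeapTupleCopyXidsFromPage is an assumption here):

```c
	Page		page;
	OffsetNumber offnum;
	OffsetNumber lines;
	bool		valid;

	LockBuffer(buffer, BUFFER_LOCK_SHARE);
	page = BufferGetPage(buffer);
	lines = PageGetMaxOffsetNumber(page);

	for (offnum = FirstOffsetNumber; offnum <= lines; offnum++)
	{
		ItemId		itemid = PageGetItemId(page, offnum);
		HeapTupleData loctup;

		if (!ItemIdIsNormal(itemid))
			continue;

		loctup.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
		loctup.t_len = ItemIdGetLength(itemid);

		/*
		 * Copy the page's xid bases into the tuple while the content lock is
		 * still held: once it is released, another backend could shift
		 * pd_xid_base, and decoding the 32-bit offsets against the old base
		 * would yield wrong xmin/xmax.
		 */
		HeapTupleCopyXidsFromPage(buffer, &loctup, page);

		valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
		/* ... remember the tuple if valid ... */
	}

	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
```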

--
Best regards,
Maxim Orlov.

Attachments:

v50-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v50-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v50-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v50-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v50-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v50-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v50-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v50-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v50-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v50-0002-Use-64-bit-format-to-output-XIDs.patch
v50-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v50-0007-Use-64-bit-GUCs.patch
v50-0006-README.XID64.patchapplication/octet-stream; name=v50-0006-README.XID64.patch
v50-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v50-0008-Use-64-bit-XIDs.patch
#113Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Maxim Orlov (#112)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

Thanks, done!

Dilip Kumar asked a good question in the thread about the 0001..0003
subset [1]. I would like to duplicate it here to make sure it was not
missed by mistake:

"""
Have we measured the WAL overhead because of this patch set? maybe
these particular patches will not impact but IIUC this is ground work
for making xid 64 bit. So each XLOG record size will increase at
least by 4 bytes because the XLogRecord contains the xid.
"""

Do we have an estimate on this?

[1]: /messages/by-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A@mail.gmail.com

--
Best regards,
Aleksander Alekseev

#114Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#113)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

Dilip Kumar asked a good question in the thread about the 0001..0003
subset [1]. I would like to duplicate it here to make sure it was not
missed by mistake:

"""
Have we measured the WAL overhead because of this patch set? maybe
these particular patches will not impact but IIUC this is ground work
for making xid 64 bit. So each XLOG record size will increase at
least by 4 bytes because the XLogRecord contains the xid.
"""

Do we have an estimate on this?

I decided to simulate one completely synthetic and non-representative
scenario when we write multiple small WAL records.

Here is what I did:

$ psql -c 'CREATE TABLE phonebook(
"id" SERIAL PRIMARY KEY NOT NULL,
"name" TEXT NOT NULL,
"phone" INT NOT NULL);'

$ echo 'INSERT INTO phonebook (name, phone) VALUES (random(),
random());' > t.sql

$ pgbench -j 8 -c 8 -f t.sql -T 60 eax

== 32-bit XIDs ==

Branch: https://github.com/afiskon/postgres/tree/64bit_xids_v50_14Nov_without_patch

pgbench output:

```
number of transactions actually processed: 68650
number of failed transactions: 0 (0.000%)
latency average = 6.993 ms
initial connection time = 5.415 ms
tps = 1144.074340 (without initial connection time)
```

$ ls -lah /home/eax/projects/pginstall/data-master/pg_wal
...
-rw------- 1 eax eax 16M Nov 16 12:48 000000010000000000000002
-rw------- 1 eax eax 16M Nov 16 12:47 000000010000000000000003

$ pg_waldump -p ~/projects/pginstall/data-master/pg_wal
000000010000000000000002 000000010000000000000003 | perl -e 'while(<>)
{ $_ =~ m#len \(rec/tot\):\s*(\d+)/\s*(\d+),#; $rec += $1; $tot += $2;
$count++; } $rec /= $count; $tot /= $count; print "rec: $rec, tot:
$tot\n";'

pg_waldump: error: error in WAL record at 0/28A4118: invalid record
length at 0/28A4190: wanted 24, got 0
rec: 65.8201835569952, tot: 67.3479022057689

== 64-bit XIDs ==

Branch: https://github.com/afiskon/postgres/tree/64bit_xids_v50_14Nov

pgbench output:

```
number of transactions actually processed: 68744
number of failed transactions: 0 (0.000%)
latency average = 6.983 ms
initial connection time = 5.334 ms
tps = 1145.664765 (without initial connection time)
```

$ ls -lah /home/eax/projects/pginstall/data-master/pg_wal
...
-rw------- 1 eax eax 16M Nov 16 12:32 000000010000000000000002
-rw------- 1 eax eax 16M Nov 16 12:31 000000010000000000000003

$ pg_waldump -p ~/projects/pginstall/data-master/pg_wal
000000010000000000000002 000000010000000000000003 | perl -e 'while(<>)
{ $_ =~ m#len \(rec/tot\):\s*(\d+)/\s*(\d+),#; $rec += $1; $tot += $2;
$count++; } $rec /= $count; $tot /= $count; print "rec: $rec, tot:
$tot\n";'

pg_waldump: error: error in WAL record at 0/29F4778: invalid record
length at 0/29F4810: wanted 26, got 0
rec: 69.1783950928736, tot: 70.6413934278527

So under this load with 64-bit XIDs we see ~5% penalty in terms of the
WAL size. This seems to be expected considering the fact that
sizeof(XLogRecord) became 4 bytes larger and the average record size
was about 66 bytes.
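
For reference, the ~5% figure follows directly from the averages measured above:

\[
\frac{69.18 - 65.82}{65.82} \approx 5.1\%\ \text{(rec)}, \qquad
\frac{70.64 - 67.35}{67.35} \approx 4.9\%\ \text{(tot)}
\]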

--
Best regards,
Aleksander Alekseev

#115Chris Travers
Chris Travers
chris.travers@gmail.com
In reply to: Aleksander Alekseev (#114)
Re: Add 64-bit XIDs into PostgreSQL 15

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

I have a very serious concern about the current patch set, as someone who has faced transaction ID wraparound in the past.

I can start by saying I think it would be helpful (if the other issues are approached reasonably) to have 64-bit xids, but there is an important piece of context in preventing xid wraparounds that seems missing from this patch unless I missed something.

XID wraparound is a symptom, not an underlying problem.  It usually occurs when autovacuum or other vacuum strategies have unexpected stalls and therefore fail to work as expected.  Shifting to 64-bit XIDs dramatically changes the sorts of problems that these stalls are likely to pose to operational teams -- you can find you are running out of storage rather than facing an imminent database shutdown.  Worse, this patch delays the problem until some (possibly far later!) time, when vacuum will take far longer to finish, and options for resolving the problem are diminished.  As a result I am concerned that merely changing xids from 32-bit to 64-bit will lead to a smaller number of far more serious outages.

What would make a big difference from my perspective would be to combine this with an inverse system for warning that there is a problem, allowing the administrator to throw warnings about xids since last vacuum, with a configurable threshold. We could have this at two billion by default as that would pose operational warnings not much later than we have now.

Otherwise I can imagine cases where instead of 30 hours to vacuum a table, it takes 300 hours on a database that is short on space. And I would not want to be facing such a situation.

The new status of this patch is: Waiting on Author

#116Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Chris Travers (#115)
Re: Add 64-bit XIDs into PostgreSQL 15

I have a very serious concern about the current patch set. as someone who has faced transaction id wraparound in the past.

I can start by saying I think it would be helpful (if the other issues are approached reasonably) to have 64-bit xids, but there is an important piece of context in preventing xid wraparounds that seems missing from this patch unless I missed something.

XID wraparound is a symptom, not an underlying problem. It usually occurs when autovacuum or other vacuum strategies have unexpected stalls and therefore fail to work as expected. Shifting to 64-bit XIDs dramatically changes the sorts of problems that these stalls are likely to pose to operational teams. -- you can find you are running out of storage rather than facing an imminent database shutdown. Worse, this patch delays the problem until some (possibly far later!) time, when vacuum will take far longer to finish, and options for resolving the problem are diminished. As a result I am concerned that merely changing xids from 32-bit to 64-bit will lead to a smaller number of far more serious outages.

What would make a big difference from my perspective would be to combine this with an inverse system for warning that there is a problem, allowing the administrator to throw warnings about xids since last vacuum, with a configurable threshold. We could have this at two billion by default as that would pose operational warnings not much later than we have now.

Otherwise I can imagine cases where instead of 30 hours to vacuum a table, it takes 300 hours on a database that is short on space. And I would not want to be facing such a situation.

Hi, Chris!
I had a similar stance when I started working on this patch. Of
course, it seemed horrible just to postpone the consequences of
inadequate monitoring, too long running transactions that prevent
aggressive autovacuum etc. So I can understand your point.

Over time I've come to a somewhat different view of this feature, i.e.

1. It's important to set up monitoring, cut-offs for long-running
transactions, etc. correctly anyway. It's not the responsibility of
anti-wraparound vacuum to report inadequate monitoring. Furthermore, in
real life, by the time the 32-bit wraparound protection kicks in it is
already too late: it causes significant downtime at an unexpected
moment. (The rough analogy is a machine running at 120 mph that turns
every control off and applies full brakes just because the coolant is
low -- of course there might have been a warning earlier, but still.)

2. The checks and handlers for an event that is never expected within the
cluster's lifetime (~200 years at a constant rate of 1e6 TPS) can simply
be dropped. Of course we still need to do automatic routine maintenance
such as truncating SLRU segments (though with a much larger interval if
we have plenty of disk space). But I consider that we need not care what
happens to a cluster after more than 200 years (it will have been
migrated many times before then, for many reasons unrelated to Postgres,
even for the most conservative owners). So the radical proposal is to
drop 64-bit wraparound handling entirely. The more moderate one is simply
not to worry too much that in 200 years we will have more hassle than
next month if we haven't set everything up correctly. Next month's pain
will be more significant, even if it teaches the DBA something.

Big thanks for your view on the general implementation of this feature, anyway.

Kind regards,
Pavel Borisov.
Supabase

#117Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Pavel Borisov (#116)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

I have a very serious concern about the current patch set. as someone who has faced transaction id wraparound in the past.

[...]

I had a similar stance when I started working on this patch. Of
course, it seemed horrible just to postpone the consequences of
inadequate monitoring, too long running transactions that prevent
aggressive autovacuum etc. So I can understand your point.

With time I've got to a little bit of another view of this feature i.e.

1. It's important to correctly set monitoring, the cut-off of long
transactions, etc. anyway. It's not the responsibility of vacuum
before wraparound to report inadequate monitoring etc. Furthermore, in
real life, this will be already too late if it prevents 32-bit
wraparound and invokes much downtime in an unexpected moment of time
if it occurs already. (The rough analogy for that is the machine
running at 120mph turns every control off and applies full brakes just
because the cooling liquid is low (of course there might be a warning
previously, but anyway))

2. The checks and handlers for the event that is never expected in the
cluster lifetime (~200 years at constant rate of 1e6 TPS) can be just
dropped. Of course we still need to do automatic routine maintenance
like cutting SLRU buffers (but with a much bigger interval if we have
much disk space e.g.). But I considered that we either can not care
what will be with cluster after > 200 years (it will be migrated many
times before this, on many reasons not related to Postgres even for
the most conservative owners). So the radical proposal is to drop
64-bit wraparound at all. The most moderate one is just not taking
very much care that after 200 years we have more hassle than next
month if we haven't set up everything correctly. Next month's pain
will be more significant even if it teaches dba something.

Big thanks for your view on the general implementation of this feature, anyway.

I'm inclined to agree with Pavel on this one. Keeping 32-bit XIDs in
order to intentionally trigger XID wraparound as an indicator of
dwindling disk space and/or a misconfigured system (by which time it's
usually too late anyway) is a somewhat arguable perspective. It would be
great to notify the user about potential issues with the configuration
and/or the fact that VACUUM isn't catching up. But that doesn't mean we
should keep 32-bit XIDs in order to achieve this.

--
Best regards,
Aleksander Alekseev

#118Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Maxim Orlov (#112)
Re: Add 64-bit XIDs into PostgreSQL 15

Question about

"""
Subject: [PATCH v50 5/8] Add initdb option to initialize cluster with
non-standard xid/mxid/mxoff.

To date, testing database cluster wraparound was not easy as initdb has always
inited it with default xid/mxid/mxoff. The option to specify any valid
xid/mxid/mxoff at cluster startup will make these things easier.
"""

Doesn't pg_resetwal already provide that functionality, or at least some
of it?

#119Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#118)
Re: Add 64-bit XIDs into PostgreSQL 15

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

To date, testing database cluster wraparound was not easy as initdb has always
inited it with default xid/mxid/mxoff. The option to specify any valid
xid/mxid/mxoff at cluster startup will make these things easier.

Doesn't pg_resetwal already provide that functionality, or at least some
of it?

pg_resetwal does seem like a better, more useful home for this; it'd
allow you to adjust these numbers after initial creation which might be
useful. I'm not sure how flexible it is right now in terms of where
you can set the new values, but that can always be improved.

regards, tom lane

#120Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#119)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On 2022-11-21 14:21:35 -0500, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

To date, testing database cluster wraparound was not easy as initdb has always
inited it with default xid/mxid/mxoff. The option to specify any valid
xid/mxid/mxoff at cluster startup will make these things easier.

Doesn't pg_resetwal already provide that functionality, or at least some
of it?

pg_resetwal does seem like a better, more useful home for this; it'd
allow you to adjust these numbers after initial creation which might be
useful. I'm not sure how flexible it is right now in terms of where
you can set the new values, but that can always be improved.

IIRC the respective pg_resetwal parameters are really hard to use for
something like this, because they don't actually create the respective
SLRU segments. We of course could fix that.

Greetings,

Andres Freund

#121Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#120)
Re: Add 64-bit XIDs into PostgreSQL 15

Andres Freund <andres@anarazel.de> writes:

On 2022-11-21 14:21:35 -0500, Tom Lane wrote:

pg_resetwal does seem like a better, more useful home for this; it'd
allow you to adjust these numbers after initial creation which might be
useful. I'm not sure how flexible it is right now in terms of where
you can set the new values, but that can always be improved.

IIRC the respective pg_resetwal parameters are really hard to use for
something like this, because they don't actually create the respective
SLRU segments. We of course could fix that.

Is that still true? We should fix it, for sure.

regards, tom lane

#122Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#121)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On 2022-11-21 15:16:46 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2022-11-21 14:21:35 -0500, Tom Lane wrote:

pg_resetwal does seem like a better, more useful home for this; it'd
allow you to adjust these numbers after initial creation which might be
useful. I'm not sure how flexible it is right now in terms of where
you can set the new values, but that can always be improved.

IIRC the respective pg_resetwal parameters are really hard to use for
something like this, because they don't actually create the respective
SLRU segments. We of course could fix that.

Is that still true? We should fix it, for sure.

Sure looks that way to me. I think it might mostly work if you manage to
find nextXid, nextMulti, nextMultiOffset values that each point to the
start of a segment that'd then be created whenever those values are
used.

Greetings,

Andres Freund

#123Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Aleksander Alekseev (#117)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 21, 2022 at 12:25 PM Aleksander Alekseev <
aleksander@timescale.com> wrote:

Hi hackers,

I have a very serious concern about the current patch set. as someone

who has faced transaction id wraparound in the past.

[...]

I had a similar stance when I started working on this patch. Of
course, it seemed horrible just to postpone the consequences of
inadequate monitoring, too long running transactions that prevent
aggressive autovacuum etc. So I can understand your point.

With time I've got to a little bit of another view of this feature i.e.

1. It's important to correctly set monitoring, the cut-off of long
transactions, etc. anyway. It's not the responsibility of vacuum
before wraparound to report inadequate monitoring etc. Furthermore, in
real life, this will be already too late if it prevents 32-bit
wraparound and invokes much downtime in an unexpected moment of time
if it occurs already. (The rough analogy for that is the machine
running at 120mph turns every control off and applies full brakes just
because the cooling liquid is low (of course there might be a warning
previously, but anyway))

2. The checks and handlers for the event that is never expected in the
cluster lifetime (~200 years at constant rate of 1e6 TPS) can be just
dropped. Of course we still need to do automatic routine maintenance
like cutting SLRU buffers (but with a much bigger interval if we have
much disk space e.g.). But I considered that we either can not care
what will be with cluster after > 200 years (it will be migrated many
times before this, on many reasons not related to Postgres even for
the most conservative owners). So the radical proposal is to drop
64-bit wraparound at all. The most moderate one is just not taking
very much care that after 200 years we have more hassle than next
month if we haven't set up everything correctly. Next month's pain
will be more significant even if it teaches dba something.

Big thanks for your view on the general implementation of this feature,

anyway.

I'm inclined to agree with Pavel on this one. Keeping 32-bit XIDs in
order to intentionally trigger XID wraparound to indicate the ending
disk space and/or misconfigured system (by the time when it's usually
too late anyway) is a somewhat arguable perspective. It would be great
to notify the user about the potential issues with the configuration
and/or the fact that VACUUM doesn't catch up. But it doesn't mean we
should keep 32-bit XIDs in order to achieve this.

That's not what I am suggesting. However I am saying that removing a
symptom of a problem so you get bit when you are in an even worse position
is a bad idea, and I would strenuously oppose including a patchset like
this without some sort of mitigating measures.

What I think should be added to address this concern is a GUC variable,
warn_max_xid_lag, and then a change to the logic which warns of impending
xid wraparound so that it instead warns when xid lag exceeds this
threshold.
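
To make the proposal concrete, a hedged sketch of what such a check might
look like (warn_max_xid_lag is the GUC name proposed above; the helper name,
default value, message text, and the assumption that plain subtraction is
meaningful with 64-bit XIDs are all illustrative, not an actual design):

```c
/*
 * Hedged sketch only: warn based on XIDs consumed since the database was
 * last frozen, instead of XIDs remaining before wraparound.
 */
#include "postgres.h"
#include "access/transam.h"

int     warn_max_xid_lag = 2000000000;      /* ~2 billion XIDs by default */

static void
warn_if_xid_lag_excessive(Oid datoid, TransactionId datfrozenxid)
{
    TransactionId   next_xid = ReadNextTransactionId();
    int64           lag = (int64) (next_xid - datfrozenxid);

    if (lag > (int64) warn_max_xid_lag)
        ereport(WARNING,
                (errmsg("database with OID %u is %lld transactions behind on freezing",
                        datoid, (long long) lag),
                 errhint("Consider tuning autovacuum or running a manual VACUUM.")));
}
```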

Databases under load take a long time to correct problems, and throwing new
problems onto DBA-land while saying "you figure out your monitoring" is not
something I want to support. Again, this comes from experience facing xid
wraparound issues. I have nothing against 64-bit xids. But I think the
patch ought not to delay the onset of visible symptoms; instead, the focus
should be on making sure that efforts to address those symptoms can be
handled using less extreme measures and at a bit of a more relaxed pace.


--
Best regards,
Aleksander Alekseev

#124Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Pavel Borisov (#116)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 21, 2022 at 10:40 AM Pavel Borisov <pashkin.elfe@gmail.com>
wrote:

I have a very serious concern about the current patch set. as someone

who has faced transaction id wraparound in the past.

I can start by saying I think it would be helpful (if the other issues

are approached reasonably) to have 64-bit xids, but there is an important
piece of context in preventing xid wraparounds that seems missing from this
patch unless I missed something.

XID wraparound is a symptom, not an underlying problem. It usually

occurs when autovacuum or other vacuum strategies have unexpected stalls
and therefore fail to work as expected. Shifting to 64-bit XIDs
dramatically changes the sorts of problems that these stalls are likely to
pose to operational teams. -- you can find you are running out of storage
rather than facing an imminent database shutdown. Worse, this patch delays
the problem until some (possibly far later!) time, when vacuum will take
far longer to finish, and options for resolving the problem are
diminished. As a result I am concerned that merely changing xids from
32-bit to 64-bit will lead to a smaller number of far more serious outages.

What would make a big difference from my perspective would be to combine

this with an inverse system for warning that there is a problem, allowing
the administrator to throw warnings about xids since last vacuum, with a
configurable threshold. We could have this at two billion by default as
that would pose operational warnings not much later than we have now.

Otherwise I can imagine cases where instead of 30 hours to vacuum a

table, it takes 300 hours on a database that is short on space. And I
would not want to be facing such a situation.

Hi, Chris!
I had a similar stance when I started working on this patch. Of
course, it seemed horrible just to postpone the consequences of
inadequate monitoring, too long running transactions that prevent
aggressive autovacuum etc. So I can understand your point.

With time I've got to a little bit of another view of this feature i.e.

1. It's important to correctly set monitoring, the cut-off of long
transactions, etc. anyway. It's not the responsibility of vacuum
before wraparound to report inadequate monitoring etc. Furthermore, in
real life, this will be already too late if it prevents 32-bit
wraparound and invokes much downtime in an unexpected moment of time
if it occurs already. (The rough analogy for that is the machine
running at 120mph turns every control off and applies full brakes just
because the cooling liquid is low (of course there might be a warning
previously, but anyway))

So I disagree with you on a few critical points here.

Right now the way things work is:
1. Database starts throwing warnings that xid wraparound is approaching
2. Database-owning team initiates an emergency response, may take downtime
or degradation of services as a result
3. People get frustrated with PostgreSQL because this is a reliability
problem.

What I am worried about is:
1. Database is running out of space
2. Database-owning team initiates an emergency response and takes more
downtime to get into a good spot
3. People get frustrated with PostgreSQL because this is a reliability
problem.

If that's the way we go, I don't think we've solved that much. And as
humans we also bias our judgments towards newsworthy events, so rarer, more
severe problems are a larger perceived problem than the more routine, less
severe problems. So I think our image as a reliable database would suffer.

An ideal resolution from my perspective would be:
1. Database starts throwing warnings that xid lag has reached severely
abnormal levels
2. Database owning team initiates an effort to correct this, and does not
take downtime or degradation of services as a result
3. People do not get frustrated because this is not a reliability problem
anymore.

Now, 64-bit xids are necessary to get us there but they are not
sufficient. One needs to fix the way we handle this sort of problem.
There is existing logic to warn if we are approaching xid wraparound.
This should be changed to check how many xids we have used rather than
how many remain, with a sensible (optionally configurable) default.

I agree it is not vacuum's responsibility. It is the responsibility of the
current warnings we have to avoid more serious problems arising from this
change. These should just be adjusted rather than dropped.


2. The checks and handlers for the event that is never expected in the
cluster lifetime (~200 years at constant rate of 1e6 TPS) can be just
dropped. Of course we still need to do automatic routine maintenance
like cutting SLRU buffers (but with a much bigger interval if we have
much disk space e.g.). But I considered that we either can not care
what will be with cluster after > 200 years (it will be migrated many
times before this, on many reasons not related to Postgres even for
the most conservative owners). So the radical proposal is to drop
64-bit wraparound at all. The most moderate one is just not taking
very much care that after 200 years we have more hassle than next
month if we haven't set up everything correctly. Next month's pain
will be more significant even if it teaches dba something.

Big thanks for your view on the general implementation of this feature,
anyway.

Kind regards,
Pavel Borisov.
Supabase

#125Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Chris Travers (#124)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Chris,

Right now the way things work is:
1. Database starts throwing warnings that xid wraparound is approaching
2. Database-owning team initiates an emergency response, may take downtime or degradation of services as a result
3. People get frustrated with PostgreSQL because this is a reliability problem.

What I am worried about is:
1. Database is running out of space
2. Database-owning team initiates an emergency response and takes more downtime to get into a good spot
3. People get frustrated with PostgreSQL because this is a reliability problem.

If that's the way we go, I don't think we've solved that much. And as humans we also bias our judgments towards newsworthy events, so rarer, more severe problems are a larger perceived problem than the more routine, less severe problems. So I think our image as a reliable database would suffer.

An ideal resolution from my perspective would be:
1. Database starts throwing warnings that xid lag has reached severely abnormal levels
2. Database owning team initiates an effort to correct this, and does not take downtime or degradation of services as a result
3. People do not get frustrated because this is not a reliability problem anymore.

Now, 64-bit xids are necessary to get us there but they are not sufficient. One needs to fix the way we handle this sort of problem. There is existing logic to warn if we are approaching xid wraparound. This should be changed to check how many xids we have used rather than remaining and have a sensible default there (optionally configurable).

I agree it is not vacuum's responsibility. It is the responsibility of the current warnings we have to avoid more serious problems arising from this change. These should just be adjusted rather than dropped.

I disagree with the axiom that XID wraparound is merely a symptom and
not a problem.

Using 32-bit XIDs was a reasonable design decision back when disk
space was limited and disks were slow. The drawback of this approach
is the need to handle wraparound, but again, back then it was a
reasonable design choice. If XIDs had been 64-bit from the beginning,
users could run one billion (1,000,000,000) TPS for 584 years without
a wraparound. We simply wouldn't have wraparound, just as there is no
wraparound for WAL segments. Now that disks are much faster and much
cheaper, 32-bit XIDs are almost certainly not a good design choice
anymore. (Especially considering the fact that this particular patch
greatly mitigates the problem of increased disk consumption.)
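
For reference, the arithmetic behind the 584-year figure:

\[
\frac{2^{64}}{10^{9}\ \text{XIDs/s} \times 31{,}557{,}600\ \text{s/yr}} \approx 584.5\ \text{years}
\]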

Also, I disagree with the argument that a DBA who doesn't monitor disk
space would care much about some strange warnings in the logs. If a
DBA doesn't monitor basic system metrics, I'm afraid we can't help this
person much.

I do agree that we could probably provide some additional help for the
rest of the users when it comes to configuring VACUUM. This is indeed
non-trivial. However I don't think this is in scope of this particular
patchset. I suggest we keep the focus in this discussion. If you have
a concrete proposal please consider starting a new thread.

This at least is my personal opinion. Let's give the rest of the
community a chance to share their thoughts.

--
Best regards,
Aleksander Alekseev

#126Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Aleksander Alekseev (#125)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Alexander!

On Tue, 22 Nov 2022 at 13:01, Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Chris,

Right now the way things work is:
1. Database starts throwing warnings that xid wraparound is approaching
2. Database-owning team initiates an emergency response, may take downtime or degradation of services as a result
3. People get frustrated with PostgreSQL because this is a reliability problem.

What I am worried about is:
1. Database is running out of space
2. Database-owning team initiates an emergency response and takes more downtime to get into a good spot
3. People get frustrated with PostgreSQL because this is a reliability problem.

If that's the way we go, I don't think we've solved that much. And as humans we also bias our judgments towards newsworthy events, so rarer, more severe problems are a larger perceived problem than the more routine, less severe problems. So I think our image as a reliable database would suffer.

An ideal resolution from my perspective would be:
1. Database starts throwing warnings that xid lag has reached severely abnormal levels
2. Database owning team initiates an effort to correct this, and does not take downtime or degradation of services as a result
3. People do not get frustrated because this is not a reliability problem anymore.

Now, 64-bit xids are necessary to get us there but they are not sufficient. One needs to fix the way we handle this sort of problem. There is existing logic to warn if we are approaching xid wraparound. This should be changed to check how many xids we have used rather than remaining and have a sensible default there (optionally configurable).

I agree it is not vacuum's responsibility. It is the responsibility of the current warnings we have to avoid more serious problems arising from this change. These should just be adjusted rather than dropped.

I disagree with the axiom that XID wraparound is merely a symptom and
not a problem.

Using 32-bit XIDs was a reasonable design decision back when disk
space was limited and disks were slow. The drawback of this approach
is the need to do the wraparound but again back then it was a
reasonable design choice. If XIDs were 64-bit from the beginning users
could run one billion (1,000,000,000) TPS for 584 years without a
wraparound. We wouldn't have it similarly as there is no wraparound
for WAL segments. Now when disks are much faster and much cheaper
32-bit XIDs are almost certainly not a good design choice anymore.
(Especially considering the fact that this particular patch mitigates
the problem of increased disk consumption greatly.)

Also I disagree with an argument that a DBA that doesn't monitor disk
space would care much about some strange warnings in the logs. If a
DBA doesn't monitor basic system metrics I'm afraid we can't help this
person much.

I do agree that we could probably provide some additional help for the
rest of the users when it comes to configuring VACUUM. This is indeed
non-trivial. However I don't think this is in scope of this particular
patchset. I suggest we keep the focus in this discussion. If you have
a concrete proposal please consider starting a new thread.

This at least is my personal opinion. Let's give the rest of the
community a chance to share their thoughts.

I agree with Alexander that notifications for the DBA are a little bit
outside the scope of the activity in this thread, unless we have simply
dropped some existing notifications on the grounds that they're no longer
significant. If that was the point, Chris, please mention which existing
notifications you want restored. I don't think it's a big deal for the
patch to keep certain notifications inherited from the master branch.

Kind regards,
Pavel Borisov
Supabase.

#127Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Pavel Borisov (#126)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

[ Excluding my personal e-mail from cc:, not sure how it got there.
Please don't cc: to afiskon@gmail.com, I'm not using it for reading
pgsql-hackers@. ]

I agree with Alexander, that notifications for DBA are a little bit
outside the scope of the activity in this thread unless we've just
dropped some existing notifications, considering they're not
significant anymore. If that was the point, please Chris mention what
existing notifications you want to return. I don't think it's a big
deal to have the patch with certain notifications inherited from
Master branch.

To clarify a bit: currently we DO notify the user about the upcoming
wraparound point [1]https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND:

"""
If for some reason autovacuum fails to clear old XIDs from a table,
the system will begin to emit warning messages like this when the
database's oldest XIDs reach forty million transactions from the
wraparound point:

WARNING: database "mydb" must be vacuumed within 39985967 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in
that database.
"""

So I'm not sure how the notification Chris proposes should differ or
why it is in scope of this patch. If the point was to make sure
certain existing notifications will be preserved - sure, why not.

[1]: https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

--
Best regards,
Aleksander Alekseev

#128Dilip Kumar
Dilip Kumar
dilipbalaut@gmail.com
In reply to: Aleksander Alekseev (#127)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 22, 2022 at 7:44 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi hackers,

[ Excluding my personal e-mail from cc:, not sure how it got there.
Please don't cc: to afiskon@gmail.com, I'm not using it for reading
pgsql-hackers@. ]

I agree with Alexander, that notifications for DBA are a little bit
outside the scope of the activity in this thread unless we've just
dropped some existing notifications, considering they're not
significant anymore. If that was the point, please Chris mention what
existing notifications you want to return. I don't think it's a big
deal to have the patch with certain notifications inherited from
Master branch.

To clarify a bit: currently we DO notify the user about the upcoming
wraparound point [1]:

"""
If for some reason autovacuum fails to clear old XIDs from a table,
the system will begin to emit warning messages like this when the
database's oldest XIDs reach forty million transactions from the
wraparound point:

WARNING: database "mydb" must be vacuumed within 39985967 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in
that database.
"""

So I'm not sure how the notification Chris proposes should differ or
why it is in scope of this patch. If the point was to make sure
certain existing notifications will be preserved - sure, why not.

IMHO, once we have 64-bit XIDs this WARNING doesn't really make sense.
Those warnings exist because those limits were problematic for 32-bit
XIDs; now that they are not, I think we should not have such warnings.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#129Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Aleksander Alekseev (#125)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 22, 2022 at 10:01 AM Aleksander Alekseev <
aleksander@timescale.com> wrote:

Hi Chris,

Right now the way things work is:
1. Database starts throwing warnings that xid wraparound is approaching
2. Database-owning team initiates an emergency response, may take

downtime or degradation of services as a result

3. People get frustrated with PostgreSQL because this is a reliability

problem.

What I am worried about is:
1. Database is running out of space
2. Database-owning team initiates an emergency response and takes more

downtime to get into a good spot

3. People get frustrated with PostgreSQL because this is a reliability

problem.

If that's the way we go, I don't think we've solved that much. And as

humans we also bias our judgments towards newsworthy events, so rarer, more
severe problems are a larger perceived problem than the more routine, less
severe problems. So I think our image as a reliable database would suffer.

An ideal resolution from my perspective would be:
1. Database starts throwing warnings that xid lag has reached severely

abnormal levels

2. Database owning team initiates an effort to correct this, and does

not take downtime or degradation of services as a result

3. People do not get frustrated because this is not a reliability

problem anymore.

Now, 64-bit xids are necessary to get us there but they are not

sufficient. One needs to fix the way we handle this sort of problem.
There is existing logic to warn if we are approaching xid wraparound. This
should be changed to check how many xids we have used rather than remaining
and have a sensible default there (optionally configurable).

I agree it is not vacuum's responsibility. It is the responsibility of

the current warnings we have to avoid more serious problems arising from
this change. These should just be adjusted rather than dropped.

I disagree with the axiom that XID wraparound is merely a symptom and
not a problem.

XID wraparound doesn't happen to healthy databases, nor does it happen to
databases actively monitoring this possibility. In the cases where it does
happen, two circumstances are present:

1. Autovacuum is stalled, and
2. Monitoring is not checking for xid lag (which would be fixed by
autovacuum if it were running properly).

XID wraparound is downstream of those problems. At least that is my
experience. If you disagree, I would like to hear why.

Additionally, those problems will still cause worse outages with this change
unless there are some mitigating measures in place. If you don't like my
proposal, I would be open to other mitigating measures. But I think there
need to be mitigating measures in a change like this.

Using 32-bit XIDs was a reasonable design decision back when disk
space was limited and disks were slow. The drawback of this approach
is the need to do the wraparound but agaig back then it was a
reasonable design choice. If XIDs were 64-bit from the beginning users
could run one billion (1,000,000,000) TPS for 584 years without a
wraparound. We wouldn't have it similarly as there is no wraparound
for WAL segments. Now when disks are much faster and much cheaper
32-bit XIDs are almost certainly not a good design choice anymore.
(Especially considering the fact that this particular patch mitigates
the problem of increased disk consumption greatly.)

I agree that 64-bit xids are a good idea. I just don't think that existing
safety measures should be ignored or reverted.

Also I disagree with an argument that a DBA that doesn't monitor disk
space would care much about some strange warnings in the logs. If a
DBA doesn't monitor basic system metrics I'm afraid we can't help this
person much.

The problem isn't just the lack of disk space, but the difficulty that
stuck autovacuum runs pose in resolving the issue. Keep in mind that
everything you can do to reclaim disk space (vacuum full, cluster,
pg_repack) will be significantly slowed down by an extremely bloated
table or index. The problem is that if you are running out of disk
space and your time to recovery is much longer than expected, then you
have a major problem. It's not just one or the other, but the combination
that poses the real risk here.

Now that's fine if you want to run a bloatless table engine, but to my
knowledge none of these are production-ready yet. ZHeap seems mostly
stalled. Oriole is still experimental. So with the current PostgreSQL
table structure, the problem remains.

A DBA can monitor disk space, but if the DBA is not also monitoring xid
lag, then by the time corrective action is taken it may be too late.

I do agree that we could probably provide some additional help for the
rest of the users when it comes to configuring VACUUM. This is indeed
non-trivial. However I don't think this is in scope of this particular
patchset. I suggest we keep the focus in this discussion. If you have
a concrete proposal please consider starting a new thread.

This at least is my personal opinion. Let's give the rest of the
community a chance to share their thoughts.

Fair enough. As I say, my proposal that this needs mitigating measures
here comes from my experience with xid wraparound and vacuum runs that took
36hrs+ to run. At present my objection stands, and I hope the committers
take that into account.


--
Best regards,
Aleksander Alekseev

#130Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Chris Travers (#129)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Chris,

XID wraparound doesn't happen to healthy databases
If you disagree, I would like to hear why.

Consider the case where you run a slow OLAP query that takes 12h to
complete alongside 100K TPS of fast OLTP-type queries on the same system.
The fast queries will consume all 32-bit XIDs in less than 12 hours,
while the OLAP query that started 12 hours ago hasn't finished yet, and
thus its tuples can't be frozen.
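
For reference, the arithmetic behind that claim:

\[
\frac{2^{32}}{10^{5}\ \text{XIDs/s}} \approx 42{,}950\ \text{s} \approx 11.9\ \text{hours}
\]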

I agree that 64-bit xids are a good idea. I just don't think that existing safety measures should be ignored or reverted.

Fair enough.

The problem isn't just the lack of disk space, but the difficulty that stuck autovacuum runs pose in resolving the issue. Keep in mind that everything you can do to reclaim disk space (vacuum full, cluster, pg_repack) will be significantly slowed down by an extremely bloated table/index comparison. The problem is that if you are running out of disk space, and your time to recovery much longer than expected, then you have a major problem. It's not just one or the other, but the combination that poses the real risk here.

Now that's fine if you want to run a bloatless table engine but to my knowledge none of these are production-ready yet. ZHeap seems mostly stalled. Oriole is still experimental. But with the current PostgreSQL table structure.

A DBA can monitor disk space, but if the DBA is not also monitoring xid lag, then by the time corrective action is taken it may be too late.

Good point but I still don't think this is related to the XID
wraparound problem.

--
Best regards,
Aleksander Alekseev

#131Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Maxim Orlov (#108)
Re: Add 64-bit XIDs into PostgreSQL 15

On Fri, 21 Oct 2022 at 17:09, Maxim Orlov <orlovmg@gmail.com> wrote:

Reviews and opinions are very welcome!

I'm wondering whether the safest way to handle this is by creating a
new TAM called "heap64", so that all storage changes happens there.
(Obviously there are still many other changes in core, but they are
more easily fixed).

That would reduce the code churn around "heap", allowing us to keep it
stable while we move to the brave new world.

Many current users see stability as one of the greatest strengths of
Postgres, so while I very much support this move, I wonder if this
gives us a way to have both stability and innovation at the same time?
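
For illustration only, a hedged sketch of how such a "heap64" table AM could
be registered, starting from the stock heap routine (the handler name and the
commented-out callback overrides are assumptions; the real work would of
course be in the storage-format changes themselves):

```c
/*
 * Hedged sketch only: a "heap64" table access method that copies the
 * built-in heap AM routine and overrides the storage-format-sensitive
 * callbacks.  Names such as heap64_tableam_handler are illustrative.
 */
#include "postgres.h"
#include "access/tableam.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(heap64_tableam_handler);

Datum
heap64_tableam_handler(PG_FUNCTION_ARGS)
{
    static TableAmRoutine heap64_methods;

    /* Start from the built-in heap routine... */
    heap64_methods = *GetHeapamTableAmRoutine();

    /*
     * ...and override the callbacks that must understand the new on-disk
     * format, e.g.:
     *   heap64_methods.tuple_insert = heap64_tuple_insert;
     *   heap64_methods.tuple_fetch_row_version = heap64_fetch_row_version;
     */

    PG_RETURN_POINTER(&heap64_methods);
}

/* SQL side (illustrative):
 *   CREATE ACCESS METHOD heap64 TYPE TABLE HANDLER heap64_tableam_handler;
 *   CREATE TABLE t (...) USING heap64;
 */
```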

--
Simon Riggs http://www.EnterpriseDB.com/

#132Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Chris Travers (#115)
Re: Add 64-bit XIDs into PostgreSQL 15

On Sun, Nov 20, 2022 at 11:58 PM Chris Travers <chris.travers@gmail.com> wrote:

I can start by saying I think it would be helpful (if the other issues are approached reasonably) to have 64-bit xids, but there is an important piece of context in preventing xid wraparounds that seems missing from this patch unless I missed something.

XID wraparound is a symptom, not an underlying problem. It usually occurs when autovacuum or other vacuum strategies have unexpected stalls and therefore fail to work as expected. Shifting to 64-bit XIDs dramatically changes the sorts of problems that these stalls are likely to pose to operational teams. -- you can find you are running out of storage rather than facing an imminent database shutdown. Worse, this patch delays the problem until some (possibly far later!) time, when vacuum will take far longer to finish, and options for resolving the problem are diminished. As a result I am concerned that merely changing xids from 32-bit to 64-bit will lead to a smaller number of far more serious outages.

This is exactly what I think (except perhaps for the part about having
fewer outages overall). The more transaction ID space you need, the
more space you're likely to need in the near future.

We can all agree that having more runway is better than having less
runway, at least in some abstract sense, but that in itself doesn't
help the patch series very much. The first time the system-level
oldestXid (or database level datminfrozenxid) attains an age of 2
billion XIDs will usually *also* be the first time it attains an age
of (say) 300 million XIDs. Even 300 million is usually a huge amount
of XID space relative to (say) the number of XIDs used every 24 hours.
So I know exactly what you mean about just addressing a symptom.

The whole project seems to just ignore basic, pertinent questions.
Questions like: why are we falling behind like this in the first
place? And: If we don't catch up soon, why should we be able to catch
up later on? Falling behind on freezing is still a huge problem with
64-bit XIDs.

Part of the problem with the current design is that table age has
approximately zero relationship with the true cost of catching up on
freezing -- we are "using the wrong units", in a very real sense. In
general we may have to do zero freezing to advance a table's
relfrozenxid age by a billion XIDs, or we might have to write
terabytes of FREEZE_PAGE records to advance a similar looking table's
relfrozenxid by just one single XID (it could also swing wildly over
time for the same table). Which the system simply doesn't try to
reason about right now.

There are no settings for freezing that use physical units, and frame
the problem as a problem of being behind by this many unfrozen pages
(they are all based on XID age). And so the problem with letting table
age get into the billions isn't even that we'll never catch up -- we
actually might catch up very easily! The real problem is that we have
no way of knowing ahead of time (at least not right now). VACUUM
should be managing the debt, *and* the uncertainty about how much debt
we're really in. VACUUM needs to have a dynamic, probabilistic
understanding of what's really going on -- something much more
sophisticated than looking at table age in autovacuum.c.

One reason why you might want to advance relfrozenxid proactively is
to give the system a better general sense of the true relationship
between logical XID space and physical freezing for a given table and
workload -- it gives a clearer picture about the conditions in the
table. The relationship between SLRU space and physical heap pages and
the work of freezing is made somewhat clearer by a more proactive
approach to advancing relfrozenxid. That's one way that VACUUM can
lower the uncertainty I referred to.

--
Peter Geoghegan

#133Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Peter Geoghegan (#132)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi hackers,

I'm wondering whether the safest way to handle this is by creating a
new TAM called "heap64", so that all storage changes happens there.

Many current users see stability as one of the greatest strengths of
Postgres, so while I very much support this move, I wonder if this
gives us a way to have both stability and innovation at the same time?

That would be nice.

However, from what I see, TransactionId is a type used globally in
PostgreSQL. It is part of structures used by the TAM interface, used in
WAL records, etc. So we would have to teach these components to work
with 64-bit XIDs anyway, and then start thinking about cases like a
user running a transaction affecting two tables, a heap32 one and a
heap64 one, where we would have to figure out which tuples are visible
and which are not. This is perhaps doable, but the maintenance burden
for the project would be too high IMO.

It seems to me that the best option we can offer for the users looking
for stability is to use the latest PostgreSQL version with 32-bit
XIDs. Assuming these users care that much about this particular design
choice of course.

The whole project seems to just ignore basic, pertinent questions.
Questions like: why are we falling behind like this in the first
place? And: If we don't catch up soon, why should we be able to catch
up later on? Falling behind on freezing is still a huge problem with
64-bit XIDs.

Is the example I provided above wrong?

"""
Consider the case when you run a slow OLAP query that takes 12h to
complete and 100K TPS of fast OLTP-type queries on the same system.
The fast queries will consume all 32-bit XIDs in less than 12 hours,
while the OLAP query started 12 hours ago didn't finish yet and thus
its tuples can't be frozen.
"""

If it is, please let me know. I would very much like to know if my
understanding here is flawed.

--
Best regards,
Aleksander Alekseev

#134Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Aleksander Alekseev (#133)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi;

Trying to discuss where we are talking past each other...

On Fri, Nov 25, 2022 at 9:38 AM Aleksander Alekseev <
aleksander@timescale.com> wrote:

Hi hackers,

I'm wondering whether the safest way to handle this is by creating a
new TAM called "heap64", so that all storage changes happens there.

Many current users see stability as one of the greatest strengths of
Postgres, so while I very much support this move, I wonder if this
gives us a way to have both stability and innovation at the same time?

That would be nice.

However from what I see TransactionId is a type used globally in
PostgresSQL. It is part of structures used by TAM interface, used in
WAL records, etc. So we will have to learn these components to work
with 64-bit XIDs anyway and then start thinking about cases like: when
a user runs a transaction affecting two tables, a heap32 one and
heap64 one and we will have to figure out which tuples are visible and
which are not. This perhaps is doable but the maintenance burden for
the project will be too high IMO.

It seems to me that the best option we can offer for the users looking
for stability is to use the latest PostgreSQL version with 32-bit
XIDs. Assuming these users care that much about this particular design
choice of course.

I didn't see any changes to pg_upgrade to make this change possible on
upgrade. Is that also outside of the scope of your patch set? If so how
is that continuity supposed to be ensured?

Also related to that, I think you would have to have a check on streaming
replication that both instances use the same xid format (that you don't
accidentally upgrade this somehow), since this is set per db cluster, right?

The whole project seems to just ignore basic, pertinent questions.
Questions like: why are we falling behind like this in the first
place? And: If we don't catch up soon, why should we be able to catch
up later on? Falling behind on freezing is still a huge problem with
64-bit XIDs.

Is the example I provided above wrong?

"""
Consider the case when you run a slow OLAP query that takes 12h to
complete and 100K TPS of fast OLTP-type queries on the same system.
The fast queries will consume all 32-bit XIDs in less than 12 hours,
while the OLAP query started 12 hours ago didn't finish yet and thus
its tuples can't be frozen.
"""

If it is, please let me know. I would very much like to know if my
understanding here is flawed.

So, you have described a scenario we cannot support today (because xids
would be exhausted within 5.5 hours at that transactional rate).
Additionally as PostgreSQL becomes more capable, this sort of scale will
increasingly be within reach and that is an important point in favor of
this effort.

That being said, there is another, today much larger, set of xid wraparound
cases that I think would be hurt if this patchset were accepted into
Postgres without the mitigating measures you consider out of bounds --
cases like Mailchimp, Adjust, and the like. This is why I keep stressing
this, and I don't think waving away concerns about use cases outside the
one you are focusing on is helpful, particularly to those of us who have
faced xid wraparounds in these cases in the past. In these cases, database
teams are usually faced with an operational emergency while tools like
vacuum, pg_repack, etc. are severely degraded due to being so far behind
on freezing. The deeper the hole, the harder it will be to dig out of.

Every large-scale high-throughput database I have ever worked on had
long-running query alerts precisely because of the impact on vacuum and the
downstream performance impacts. I would love to get to a point where this
wasn't necessary and maybe in a few specific workloads we might be there
very soon. The effort you are engaging in here is an important part of the
path to get there, but let's not forget the people who today are facing xid
wraparounds due to vacuum problems and what this sort of set of changes
will mean for them.

--


Best regards,
Aleksander Alekseev

#135Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Chris Travers (#134)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Quite some time has passed since the previous update.
Here is a rebased version of the patch set.

--
Best regards,
Maxim Orlov.

Attachments:

v51-0004-Use-64-bit-pages-representation-in-SLRU-callers.patchapplication/octet-stream; name=v51-0004-Use-64-bit-pages-representation-in-SLRU-callers.patch
v51-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v51-0003-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v51-0001-Use-64-bit-numbering-of-SLRU-pages.patchapplication/octet-stream; name=v51-0001-Use-64-bit-numbering-of-SLRU-pages.patch
v51-0002-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v51-0002-Use-64-bit-format-to-output-XIDs.patch
v51-0006-README.XID64.patchapplication/octet-stream; name=v51-0006-README.XID64.patch
v51-0005-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v51-0005-Add-initdb-option-to-initialize-cluster-with-non.patch
v51-0007-Use-64-bit-GUCs.patchapplication/octet-stream; name=v51-0007-Use-64-bit-GUCs.patch
v51-0008-Use-64-bit-XIDs.patchapplication/octet-stream; name=v51-0008-Use-64-bit-XIDs.patch
#136Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Chris Travers (#134)
Re: Add 64-bit XIDs into PostgreSQL 15

On Sat, Nov 26, 2022 at 4:08 AM Chris Travers <chris@orioledata.com> wrote:

I didn't see any changes to pg_upgrade to make this change possible on upgrade. Is that also outside of the scope of your patch set? If so how is that continuity supposed to be ensured?

The scheme is documented in their 0006 patch, in a README.XID file.
I'm not entirely confident that it's the best design and have argued
against it in the past, but it's not crazy.

More generally, while I think there's plenty of stuff to be concerned
about in this patch set and while I'm somewhat skeptical about the
likelihood of its getting or staying committed, I can't really
understand your concerns in particular. The thrust of your concern
seems to be that if we allow people to get further behind, recovery
will be more difficult. I'm not sure I see the problem. Suppose that
we adopt this proposal and that it is bug-free. Now, consider a user
who gets 8 billion XIDs behind. They probably have to vacuum pretty
much every page in the database to do that, or at least every page in the
tables that haven't been vacuumed recently. But that would likely also
be true if they were 800 million XIDs behind, as is possible today.
The effort to catch up doesn't increase linearly with how far behind
you are, and is always bounded by the DB size.

It is true that if the table is progressively bloating, it is likely
to be more bloated by the time you are 8 billion XIDs behind than it
was when you were 800 million XIDs behind. I don't see that as a very
good reason not to adopt this patch, because you can bloat the table
by an arbitrarily large amount while consuming only a small number of
XiDs, even just 1 XID. Protecting against bloat is good, but shutting
down the database when the XID age reaches a certain value is not a
particularly effective way of doing that, so saying that we'll be
hurting people by not shutting down the database at the point where we
do so today doesn't ring true to me. I think that most people who get
to the point of wraparound shutdown have workloads where bloat isn't a
huge issue, because those who do start having problems with the bloat
way before they run out of XIDs.

It would be entirely possible to add a parameter to the system that
says "hey, you know we can keep running even if we're a shazillion
XiDs behind, but instead shut down when we are behind by this number
of XIDs." Then, if somebody wants to force an automatic shutdown at
that point, they could, and I think that then the scenario you're
worried about just can't happen any more. But isn't that a little bit
silly? You could also just monitor how far behind you are and page the
DBA when you get behind by more than a certain number of XIDs. Then,
you wouldn't be risking a shutdown, and you'd still be able to stay on
top of the XID ages of your tables.

Philosophically, I disagree with the idea of shutting down the
database completely in any situation in which a reasonable alternative
exists. Losing read and write availability is really bad, and I don't
think it's what users want. I think that most users want the database
to degrade gracefully when things do not go according to plan.
Ideally, they'd like everything to Just Work, but reasonable users
understand that sometimes there are going to be problems, and in my
experience, what makes them happy is when the database acts to contain
the scope of the problem so that it affects their workload as little
as possible, rather than acting to magnify the problem so that it
impacts their workload as much as possible. This patch, implementation
and design concerns to one side, does that.

I don't believe there's a single right answer to the question of what
to do about vacuum falling behind, and I think it's worth exploring
multiple avenues to improve the situation. You can have vacuum never
run on a table at all, say because all of the workers are busy
elsewhere, or because the table is locked until the heat death of the
universe. You can have vacuum run on a table but too slowly to do any
good, because of the vacuum cost delay mechanism. You can have vacuum
run and finish but do little good because of prepared transactions or
replication slots or long-running queries. It's reasonable to think
about what kinds of steps might help in those different scenarios, and
especially to think about what kind of steps might help in multiple
cases. We should do that. But, I don't think any of that means that we
can ignore the need for some kind of expansion of the XID space
forever. Computers are getting faster. It's already possible to burn
through the XID space in hours, and the number of hours is going to go
down over time and maybe eventually the right unit will be minutes, or
even seconds. Sometime before then, we need to do something to make
the runway bigger, or else just give up on PostgreSQL being a relevant
piece of software.

Perhaps the thing we need to do is not exactly this, but if not, it's
probably a sibling or cousin of this.

--
Robert Haas
EDB: http://www.enterprisedb.com

#137Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Robert Haas (#136)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 8:53 AM Robert Haas <robertmhaas@gmail.com> wrote:

It is true that if the table is progressively bloating, it is likely
to be more bloated by the time you are 8 billion XIDs behind than it
was when you were 800 million XIDs behind. I don't see that as a very
good reason not to adopt this patch, because you can bloat the table
by an arbitrarily large amount while consuming only a small number of
XIDs, even just 1 XID. Protecting against bloat is good, but shutting
down the database when the XID age reaches a certain value is not a
particularly effective way of doing that, so saying that we'll be
hurting people by not shutting down the database at the point where we
do so today doesn't ring true to me.

I can't speak for Chris, but I think that almost everybody will agree
on this much, without really having to think about it. It's easy to
see that having more XID space is, in general, strictly a good thing.
If there was a low risk way of getting that benefit, then I'd be in
favor of it.

Here's the problem that I see with this patch: I don't think that the
risks are commensurate with the benefits. I can imagine being in favor
of an even more invasive patch that (say) totally removes the concept
of freezing, but that would have to be a very different sort of
design.

Philosophically, I disagree with the idea of shutting down the
database completely in any situation in which a reasonable alternative
exists. Losing read and write availability is really bad, and I don't
think it's what users want.

At a certain point it may make more sense to activate XidStopLimit
protections (which will only prevent new XID allocations) instead of
getting further behind on freezing, even in a world where we're never
strictly obligated to activate XidStopLimit. It may in fact be the
lesser evil, even with 64-bit XIDs -- because we still have to freeze,
and the factors that drive when and how we freeze mostly aren't
changed.

Fundamentally, when we're falling behind on freezing, at a certain
point we can expect to keep falling behind -- unless some kind of
major shift happens. That's just how freezing works, with or without
64-bit XIDs/MXIDs. If VACUUM isn't keeping up with the allocation of
transactions, then the system is probably misconfigured in some way.
We should do our best to signal this as early and as frequently as
possible, and we should mitigate specific hazards (e.g. old temp
tables) if at all possible. We should activate the failsafe when
things really start to look dicey (which, incidentally, the patch just
removes). These mitigations may be very effective, but in the final
analysis they don't address the fundamental weakness in freezing.

Granted, the specifics of the current XidStopLimit mechanism are
unlikely to directly carry over to 64-bit XIDs. XidStopLimit is
structured in a way that doesn't actually consider freeze debt in
units like unfrozen pages. Like Chris, I just don't see why the patch
obviates the need for something like XidStopLimit, since the patch
doesn't remove freezing. An improved XidStopLimit mechanism might even
end up kicking in *before* the oldest relfrozenxid reached 2 billion
XIDs, depending on the specifics.

Removing the failsafe mechanism seems misguided to me for similar
reasons. I recently learned that Amazon RDS has set a *lower*
vacuum_failsafe_age default than the standard one (1.2 billion on RDS
versus the standard 1.6 billion). This decision predates my
joining AWS. It seems as if practical experience has shown that
allowing any table's age(relfrozenxid) to get too far past a billion
is not a good idea. At least it's not a good idea on modern Postgres
versions that have the freeze map.

We really shouldn't have to rely on having billions of XIDs available
in the first place -- XID space isn't really a fungible commodity.
It's much more important to recognize that something (some specific
thing) just isn't working as designed, which in general could be
pretty far removed from freezing. For example, index corruption could
do it (at least without the failsafe). Some kind of autovacuum
starvation could do it. It's almost always more complicated than "not
enough available XID space".

--
Peter Geoghegan

#138Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#137)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 4:09 PM Peter Geoghegan <pg@bowt.ie> wrote:

Granted, the specifics of the current XidStopLimit mechanism are
unlikely to directly carry over to 64-bit XIDs. XidStopLimit is
structured in a way that doesn't actually consider freeze debt in
units like unfrozen pages. Like Chris, I just don't see why the patch
obviates the need for something like XidStopLimit, since the patch
doesn't remove freezing. An improved XidStopLimit mechanism might even
end up kicking in *before* the oldest relfrozenxid reached 2 billion
XIDs, depending on the specifics.

What is the purpose of using 64-bit XIDs, if not to avoid having to
stop the world when we run short of XIDs?

I'd say that if this patch, or any patch with broadly similar goals,
fails to remove xidStopLimit, it might as well not exist.

xidStopLimit is not a general defense against falling too far behind
on freezing, and in general, there is no reason to think that we need
such a defense. xidStopLimit is a defense against reusing the same XID
and thus causing irreversible database corruption. When that
possibility no longer exists, it has outlived its usefulness and we
should be absolutely delighted to bid it adieu.

It seems like you and Chris are proposing the moral equivalent of
paying off your home loan but still sending a check to the mortgage
company every month just to be sure they don't get mad.

--
Robert Haas
EDB: http://www.enterprisedb.com

#139Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#138)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 04:30:22PM -0500, Robert Haas wrote:

What is the purpose of using 64-bit XIDs, if not to avoid having to
stop the world when we run short of XIDs?

I'd say that if this patch, or any patch with broadly similar goals,
fails to remove xidStopLimit, it might as well not exist.

xidStopLimit is not a general defense against falling too far behind
on freezing, and in general, there is no reason to think that we need
such a defense. xidStopLimit is a defense against reusing the same XID
and thus causing irreversible database corruption. When that
possibility no longer exists, it has outlived its usefulness and we
should be absolutely delighted to bid it adieu.

It seems like you and Chris are proposing the moral equivalent of
paying off your home loan but still sending a check to the mortgage
company every month just to be sure they don't get mad.

I think the problem is that we still have bloat with 64-bit XIDs,
specifically pg_xact and pg_multixact files. Yes, that bloat is less
serious, but it is still an issue worth reporting in the server logs,
though not serious enough to stop the server from accepting write queries.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#140Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Robert Haas (#138)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 1:30 PM Robert Haas <robertmhaas@gmail.com> wrote:

What is the purpose of using 64-bit XIDs, if not to avoid having to
stop the world when we run short of XIDs?

I agree that the xidStopLimit mechanism was designed with the specific
goal of preventing "true" physical XID wraparound that results in
wrong answers to queries. It was not written with the intention of
limiting the accumulation of freeze debt, which would have to use
units like unfrozen heap pages to make any sense. That is a historic
fact -- no question.

I think that it is nevertheless quite possible that just refusing to
allocate any more XIDs could make sense with 64-bit XIDs, where we
don't strictly have to do so to keep the system fully functional. That might
still be the lesser evil, in that other world. The cutoff itself would
depend on many workload details, I suppose.

Imagine if we actually had 64-bit XIDs -- let's assume for a moment
that it's a done deal. This raises a somewhat awkward question: do you
just let the system get further and further behind on freezing,
forever? We can all agree that 2 billion XIDs is very likely the wrong
time to start refusing new XIDs -- because it just isn't designed with
debt in mind. But what's the right time, if any? How much debt is too
much?

At the very least these seem like questions that deserve serious consideration.

I'd say that if this patch, or any patch with broadly similar goals,
fails to remove xidStopLimit, it might as well not exist.

Well, it could in principle be driven by lots of different kinds of
information, and make better decisions by actually targeting freeze
debt in some way. An "enhanced version of xidStopLimit with 64-bit
XIDs" could kick in far far later than it currently would. Obviously
that has some value.

I'm not claiming to know how to build this "better xidStopLimit
mechanism", by the way. I'm not seriously proposing it. Mostly I'm
just saying that the question "where do you draw the line if not at 2
billion XIDs?" is a very pertinent question. It is not a question that
is made any less valid by the fact that we already know that 2 billion
XIDs is pretty far from optimal in almost all cases. Having some limit
based on something is likely to be more effective than having no limit
based on nothing.

Admittedly this argument works a lot better with the failsafe than it
does with xidStopLimit. Both are removed by the patch.

--
Peter Geoghegan

#141Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Bruce Momjian (#139)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 1:52 PM Bruce Momjian <bruce@momjian.us> wrote:

I think the problem is that we still have bloat with 64-bit XIDs,
specifically pg_xact and pg_multixact files. Yes, that bloat is less
serious, but it is still an issue worth reporting in the server logs,
though not serious enough to stop the server from accepting write queries.

That's definitely a big part of it.

Again, I don't believe that the idea is fundamentally without merit.
Just that it's not worth it, given that having more XID space is very
much not something that I think fixes most of the problems. And given
the real risk of serious bugs with something this invasive.

I believe that it would be more useful to focus on just not getting
into trouble in the first place, as well as on mitigating specific
problems that lead to the system reaching xidStopLimit in practice. I
don't think that there is any good reason to allow datfrozenxid to go
past about a billion. When it does the interesting questions are
questions about what went wrong, and how that specific failure can be
mitigated in a fairly direct way.

We've already used way too much "XID space runway", so why should using
even more help? It might, I suppose, but it almost seems like a
distraction to me, as somebody that wants to make things better for
users in general. As long as the system continues to misbehave (in
whatever way it happens to be misbehaving), why should any amount of
XID space ever be enough?

I think that we'll be able to get rid of freezing in a few years time.
But as long as we have freezing, we have these problems.

--
Peter Geoghegan

#142Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#140)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 1:52 PM Peter Geoghegan <pg@bowt.ie> wrote:

I'm not claiming to know how to build this "better xidStopLimit
mechanism", by the way. I'm not seriously proposing it. Mostly I'm
just saying that the question "where do you draw the line if not at 2
billion XIDs?" is a very pertinent question. It is not a question that
is made any less valid by the fact that we already know that 2 billion
XIDs is pretty far from optimal in almost all cases. Having some limit
based on something is likely to be more effective than having no limit
based on nothing.

Admittedly this argument works a lot better with the failsafe than it
does with xidStopLimit. Both are removed by the patch.

Come to think of it, if you were going to do something like this it
would probably work by throttling XID allocations, with a gradual
ramp-up. It would rarely get to the point that the system refused to
allocate XIDs completely.

It's not fundamentally unreasonable to force the application to live
within its means by throttling. Feedback that slows down the rate of
writes is much more common in the LSM tree world, within systems like
MyRocks [1]https://docs.google.com/presentation/d/1WgP-SlKay5AnSoVDSvOIzmu7edMmtYhdywoa0oAR4JQ/edit#slide=id.g8839c9d71b_0_79 -- Peter Geoghegan, where the risk of the system being destabilized by debt
is more pressing.
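
To make the throttling idea concrete, here is a minimal sketch of a
gradual ramp-up, with invented thresholds and function names (nothing
here is taken from the patch set or from the server code):

```c
#include <stdint.h>

#define SOFT_LIMIT   UINT64_C(1000000000)   /* start throttling at this much XID lag */
#define HARD_LIMIT   UINT64_C(4000000000)   /* maximum throttling at this much lag */
#define MAX_DELAY_US UINT64_C(100000)       /* at worst, 100 ms per XID allocation */

/* How long to sleep (in microseconds) before handing out the next XID. */
static uint64_t
xid_allocation_delay(uint64_t oldest_xid_age)
{
    if (oldest_xid_age <= SOFT_LIMIT)
        return 0;                       /* plenty of headroom: no feedback at all */
    if (oldest_xid_age >= HARD_LIMIT)
        return MAX_DELAY_US;            /* cap the delay, but never refuse outright */

    /* linear ramp-up between the two thresholds */
    return MAX_DELAY_US * (oldest_xid_age - SOFT_LIMIT) / (HARD_LIMIT - SOFT_LIMIT);
}
```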

As I said, I don't think that this is a particularly promising way of
addressing problems with Postgres XID space exhaustion, since I
believe that the underlying issue isn't usually a simple lack of "XID
space slack capacity". But if you assume that I'm wrong here (if you
assume that we very often don't have the ability to freeze lazily
enough), then ISTM that throttling or feedback to stall new writes is
a very reasonable option. In fact, it's practically mandatory.

[1]: https://docs.google.com/presentation/d/1WgP-SlKay5AnSoVDSvOIzmu7edMmtYhdywoa0oAR4JQ/edit#slide=id.g8839c9d71b_0_79
--
Peter Geoghegan

#143Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Robert Haas (#136)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi;

I suppose I must not have been clear in what I am suggesting we do and
why. I will try to answer specific points below and then restate what I
think the problem is, and what I think should be done about it.

On Mon, Nov 28, 2022 at 5:53 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Nov 26, 2022 at 4:08 AM Chris Travers <chris@orioledata.com>
wrote:

I didn't see any changes to pg_upgrade to make this change possible on
upgrade. Is that also outside of the scope of your patch set? If so how
is that continuity supposed to be ensured?

The scheme is documented in their 0006 patch, in a README.XID file.
I'm not entirely confident that it's the best design and have argued
against it in the past, but it's not crazy.

Right. Per previous discussion I thought there was some discussion of
allowing people to run with the existing behavior. I must have been
mistaken. If that is off the table then pg_upgrade and runtime replication
checks don't matter.

More generally, while I think there's plenty of stuff to be concerned
about in this patch set and while I'm somewhat skeptical about the
likelihood of its getting or staying committed, I can't really
understand your concerns in particular. The thrust of your concern
seems to be that if we allow people to get further behind, recovery
will be more difficult. I'm not sure I see the problem. Suppose that
we adopt this proposal and that it is bug-free. Now, consider a user
who gets 8 billion XIDs behind. They probably have to vacuum pretty
much every page in the database to do that, or at least every page in the
tables that haven't been vacuumed recently. But that would likely also
be true if they were 800 million XIDs behind, as is possible today.
The effort to catch up doesn't increase linearly with how far behind
you are, and is always bounded by the DB size.

Right. I agree with all of that.

It is true that if the table is progressively bloating, it is likely
to be more bloated by the time you are 8 billion XIDs behind than it
was when you were 800 million XIDs behind. I don't see that as a very
good reason not to adopt this patch, because you can bloat the table
by an arbitrarily large amount while consuming only a small number of
XIDs, even just 1 XID. Protecting against bloat is good, but shutting
down the database when the XID age reaches a certain value is not a
particularly effective way of doing that, so saying that we'll be
hurting people by not shutting down the database at the point where we
do so today doesn't ring true to me. I think that most people who get
to the point of wraparound shutdown have workloads where bloat isn't a
huge issue, because those who do start having problems with the bloat
way before they run out of XIDs.

To be clear, I never suggested shutting down the database. What I have
suggested is that repurposing the current approaching-xid-wraparound
warnings to start complaining loudly when a threshold is exceeded would be
helpful. I think it makes sense to make that threshold configurable
especially if we eventually have people running bloat-free table structures.

There are two fundamental problems here. The first is that if, as you say,
a table is progressively bloating and we are getting further and further
behind on vacuuming and freezing, something is seriously wrong and we need
to do something about it. In these cases, my experience is that
vacuuming and related tools tend to suffer degraded performance, and
determining how to solve the problem takes quite a bit more time than a
routine bloat issue would. So what I am arguing against is treating the
problem just as a bloat issue. If you get there due to vacuum being slow,
something else is wrong and you are probably going to have to find and fix
that as well in order to catch up. At least that's my experience.

I don't object to the db continuing to run, allocate xids etc. What I
object to is it doing so silently when things are almost certainly
going very wrong.

It would be entirely possible to add a parameter to the system that
says "hey, you know we can keep running even if we're a shazillion
XIDs behind, but instead shut down when we are behind by this number
of XIDs." Then, if somebody wants to force an automatic shutdown at
that point, they could, and I think that then the scenario you're
worried about just can't happen any more. But isn't that a little bit
silly? You could also just monitor how far behind you are and page the
DBA when you get behind by more than a certain number of XIDs. Then,
you wouldn't be risking a shutdown, and you'd still be able to stay on
top of the XID ages of your tables.

Philosophically, I disagree with the idea of shutting down the
database completely in any situation in which a reasonable alternative
exists. Losing read and write availability is really bad, and I don't
think it's what users want. I think that most users want the database
to degrade gracefully when things do not go according to plan.
Ideally, they'd like everything to Just Work, but reasonable users
understand that sometimes there are going to be problems, and in my
experience, what makes them happy is when the database acts to contain
the scope of the problem so that it affects their workload as little
as possible, rather than acting to magnify the problem so that it
impacts their workload as much as possible. This patch, implementation
and design concerns to one side, does that.

I don't believe there's a single right answer to the question of what
to do about vacuum falling behind, and I think it's worth exploring
multiple avenues to improve the situation. You can have vacuum never
run on a table at all, say because all of the workers are busy
elsewhere, or because the table is locked until the heat death of the
universe. You can have vacuum run on a table but too slowly to do any
good, because of the vacuum cost delay mechanism. You can have vacuum
run and finish but do little good because of prepared transactions or
replication slots or long-running queries. It's reasonable to think
about what kinds of steps might help in those different scenarios, and
especially to think about what kind of steps might help in multiple
cases. We should do that. But, I don't think any of that means that we
can ignore the need for some kind of expansion of the XID space
forever. Computers are getting faster. It's already possible to burn
through the XID space in hours, and the number of hours is going to go
down over time and maybe eventually the right unit will be minutes, or
even seconds. Sometime before then, we need to do something to make
the runway bigger, or else just give up on PostgreSQL being a relevant
piece of software.

Perhaps the thing we need to do is not exactly this, but if not, it's
probably a sibling or cousin of this.

To be clear, I am not opposed to doing this. I just think there is a small
missing piece which would avoid operational nightmares.


--
Robert Haas
EDB: http://www.enterprisedb.com

#144Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Peter Geoghegan (#141)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 11:06 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Nov 28, 2022 at 1:52 PM Bruce Momjian <bruce@momjian.us> wrote:

I think the problem is that we still have bloat with 64-bit XIDs,
specifically pg_xact and pg_multixact files. Yes, that bloat is less
serious, but it is still an issue worth reporting in the server logs,
though not serious enough to stop the server from write queries.

That's definitely a big part of it.

Again, I don't believe that the idea is fundamentally without merit.
Just that it's not worth it, given that having more XID space is very
much not something that I think fixes most of the problems. And given
the real risk of serious bugs with something this invasive.

I believe that it would be more useful to focus on just not getting
into trouble in the first place, as well as on mitigating specific
problems that lead to the system reaching xidStopLimit in practice. I
don't think that there is any good reason to allow datfrozenxid to go
past about a billion. When it does the interesting questions are
questions about what went wrong, and how that specific failure can be
mitigated in a fairly direct way.

We've already used way too much "XID space runway", so why should using
even more help? It might, I suppose, but it almost seems like a
distraction to me, as somebody that wants to make things better for
users in general. As long as the system continues to misbehave (in
whatever way it happens to be misbehaving), why should any amount of
XID space ever be enough?

So I think the problem is that PostgreSQL is becoming more and more
scalable, hardware is becoming more capable, and certain use cases are
continuing to scale up. Over time, we tend to find ourselves approaching
the end of the runway at ever higher velocities. That's a problem that
will get significantly worse over time.

Of course, as I think we agree, the priorities should be (in order):
1. Avoid trouble
2. Recover from trouble early
3. Provide more and better options for recovery.

I think 64bit xids are a very good idea, but they really fit in this bottom
tier. Not being up against mathematical limits to the software when
things are going bad is certainly a good thing. But I am really worried
about the attitude that this patch really avoids trouble because in many
cases, I don't think it does, and therefore I believe we need to make sure
we are not reducing visibility of underlying problems.


I think that we'll be able to get rid of freezing in a few years time.
But as long as we have freezing, we have these problems.

--
Peter Geoghegan

#145Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Chris Travers (#144)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 29, 2022 at 02:35:20PM +0100, Chris Travers wrote:

So I think the problem is that PostgreSQL is becoming more and more scalable,
hardware is becoming more capable, and certain use cases are continuing to
scale up.  Over time, we tend to find ourselves approaching the end of the
runway at ever higher velocities.  That's a problem that will get significantly
worse over time.

Of course, as I think we agree, the priorities should be (in order):
1.  Avoid trouble
2.  Recover from trouble early
3.  Provide more and better options for recovery.

Warning about trouble is another area we should focus on here.

I think 64bit xids are a very good idea, but they really fit in this bottom
tier.   Not being up against mathematical limits to the software when things
are going bad is certainly a good thing.  But I am really worried about the
attitude that this patch really avoids trouble because in many cases, I don't
think it does and therefore I believe we need to make sure we are not reducing
visibility of underlying problems.

As far as I know, all our freeze values are focused on avoiding XID
wraparound. If XID wraparound is no longer an issue, we might find that
our freeze limits can be much higher than they are now.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#146Jan Wieck
Jan Wieck
jan@wi3ck.info
In reply to: Bruce Momjian (#145)
Re: Add 64-bit XIDs into PostgreSQL 15

On 11/29/22 09:46, Bruce Momjian wrote:

As far as I know, all our freeze values are focused on avoiding XID
wraparound. If XID wraparound is no longer an issue, we might find that
our freeze limits can be much higher than they are now.

I'd be careful in that direction, as the values together with
maintenance_work_mem also keep a lid on excessive index cleanup rounds.

Regards, Jan

#147Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#140)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, Nov 28, 2022 at 4:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

Imagine if we actually had 64-bit XIDs -- let's assume for a moment
that it's a done deal. This raises a somewhat awkward question: do you
just let the system get further and further behind on freezing,
forever? We can all agree that 2 billion XIDs is very likely the wrong
time to start refusing new XIDs -- because it just isn't designed with
debt in mind. But what's the right time, if any? How much debt is too
much?

I simply don't see a reason to ever stop the server entirely. I don't
even agree with the idea of slowing down XID allocation, let alone
refusing it completely. When the range of allocated XIDs becomes too
large, several bad things happen. First, we become unable to allocate
new XIDs without corrupting the database. Second, pg_clog and other
SLRUs become uncomfortably large. There may be some other things too
that I'm not thinking about. But these things are not all equally bad.
If these were medical problems, being unable to allocate new XIDs
without data corruption would be a heart attack, and SLRUs getting
bigger on disk would be acne. You don't handle problems of such wildly
differing severity in the same way. When someone is having a heart
attack, an ambulance rushes them to the hospital, running red lights
as necessary. When someone has acne, you don't take them to the same
hospital in the same ambulance and drive it at a slower rate of speed.
You do something else entirely, and it's something that is in every
way much less dramatic. There's no such thing as an attack of acne
that's so bad that it requires an ambulance ride, but even a mild
heart attack should result in a fast trip to the ER. So here. The two
problems are so qualitatively different that the responses should also
be qualitatively different.

Admittedly this argument works a lot better with the failsafe than it
does with xidStopLimit. Both are removed by the patch.

I don't think the failsafe stuff should be removed, but it should
probably be modified in some way. Running out of XIDs is the only
valid reason for stopping the world, at least IMO, but it is
definitely NOT the only reason for vacuuming more aggressively.

--
Robert Haas
EDB: http://www.enterprisedb.com

#148Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Chris Travers (#143)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 29, 2022 at 8:03 AM Chris Travers <chris@orioledata.com> wrote:

To be clear, I never suggested shutting down the database. What I have suggested is that repurposing the current approaching-xid-wraparound warnings to start complaining loudly when a threshold is exceeded would be helpful. I think it makes sense to make that threshold configurable especially if we eventually have people running bloat-free table structures.

There are two fundamental problems here. The first is that if, as you say, a table is progressively bloating and we are getting further and further behind on vacuuming and freezing, something is seriously wrong and we need to do something about it. In these cases, my experience is that vacuuming and related tools tend to suffer degraded performance, and determining how to solve the problem takes quite a bit more time than a routine bloat issue would. So what I am arguing against is treating the problem just as a bloat issue. If you get there due to vacuum being slow, something else is wrong and you are probably going to have to find and fix that as well in order to catch up. At least that's my experience.

I don't object to the db continuing to run, allocate xids etc. What I object to is it doing so silently when things are almost certainly going very wrong.

OK. My feeling is that the set of things we can do to warn the user is
somewhat limited. I'm open to trying our best, but we need to have
reasonable expectations. Sophisticated users will be monitoring for
problems even if we do nothing to warn, and dumb ones won't look at
their logs. Any feature that proposes to warn must aim at the users who
are smart enough to check the logs but dumb enough not to have any
more sophisticated monitoring. Such users certainly exist and are not
even uncommon, but they aren't the only kind by a long shot.

My argument is that removing xidStopLimit is totally fine, because it
only serves to stop the database. What to do about xidWarnLimit is a
slightly more complex question. Certainly it can't be left untouched,
because warning that we're about to shut down the database for lack of
allocatable XIDs is not sensible if there is no such lack and we
aren't going to shut it down. But I'm also not sure if the model is
right. Doing nothing for a long time and then warning in every
transaction when some threshold is crossed is an extreme behavior
change. Right now that's somewhat justified because we're about to hit
a brick wall at full speed, but if we remove the brick wall and
replace it with a gentle pelting with rotten eggs, it's unclear that a
similarly strenuous reaction is the right model. But that's also not
to say that we should do nothing at all.

--
Robert Haas
EDB: http://www.enterprisedb.com

#149Bruce Momjian
Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#148)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 29, 2022 at 11:41:07AM -0500, Robert Haas wrote:

My argument is that removing xidStopLimit is totally fine, because it
only serves to stop the database. What to do about xidWarnLimit is a
slightly more complex question. Certainly it can't be left untouched,
because warning that we're about to shut down the database for lack of
allocatable XIDs is not sensible if there is no such lack and we
aren't going to shut it down. But I'm also not sure if the model is
right. Doing nothing for a long time and then warning in every
transaction when some threshold is crossed is an extreme behavior
change. Right now that's somewhat justified because we're about to hit
a brick wall at full speed, but if we remove the brick wall and
replace it with a gentle pelting with rotten eggs, it's unclear that a
similarly strenuous reaction is the right model. But that's also not
to say that we should do nothing at all.

Yeah, we would probably need to warn on every 1 million transactions or
something.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#150Greg Stark
Greg Stark
stark@mit.edu
In reply to: Peter Geoghegan (#140)
Re: Add 64-bit XIDs into PostgreSQL 15

On Mon, 28 Nov 2022 at 16:53, Peter Geoghegan <pg@bowt.ie> wrote:

Imagine if we actually had 64-bit XIDs -- let's assume for a moment
that it's a done deal. This raises a somewhat awkward question: do you
just let the system get further and further behind on freezing,
forever? We can all agree that 2 billion XIDs is very likely the wrong
time to start refusing new XIDs -- because it just isn't designed with
debt in mind. But what's the right time, if any? How much debt is too
much?

My first thought was... why not? Just let the system get further and
further behind on freezing. Where's the harm?

Picture an insert-only database that is receiving data very quickly,
never having data deleted or modified. Vacuum takes several days to
complete, and the system wraps the 32-bit xid space several times a day.

The DBA asks you why they are even bothering to run vacuum. They have
plenty of storage for clog, latency on selects is not a pain point,
not compared to running multi-day vacuums that impact insert times....

That isn't far off the scenario where I've seen wraparound being a
pain btw. Anti-wraparound vacuum took about 2 days and was kicking off
pretty much as soon as the previous one finished. For a table that was
mostly read-only.

Of course to make the judgement the DBA needs to have good ways to
measure the space usage of clog, and the overhead caused by clog
lookups that could be avoided. Then they can judge for themselves how
much freezing is appropriate.
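
For a rough sense of scale, clog keeps two commit-status bits per
transaction, so the pg_xact footprint can be estimated directly from the
span of unfrozen XIDs; a minimal sketch, with the constant assumed here
rather than taken from the server code:

```c
#include <stdint.h>

#define XACTS_PER_BYTE 4        /* two status bits per transaction (assumed) */

/* Approximate pg_xact size in bytes for a given span of unfrozen XIDs. */
static uint64_t
approx_clog_bytes(uint64_t unfrozen_xid_span)
{
    /* e.g. 8 billion unfrozen XIDs would need roughly 2 GB of pg_xact */
    return unfrozen_xid_span / XACTS_PER_BYTE;
}
```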

--
greg

#151Chris Travers
Chris Travers
chris@orioledata.com
In reply to: Bruce Momjian (#149)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 29, 2022 at 5:57 PM Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Nov 29, 2022 at 11:41:07AM -0500, Robert Haas wrote:

My argument is that removing xidStopLimit is totally fine, because it
only serves to stop the database. What to do about xidWarnLimit is a
slightly more complex question. Certainly it can't be left untouched,
because warning that we're about to shut down the database for lack of
allocatable XIDs is not sensible if there is no such lack and we
aren't going to shut it down. But I'm also not sure if the model is
right. Doing nothing for a long time and then warning in every
transaction when some threshold is crossed is an extreme behavior
change. Right now that's somewhat justified because we're about to hit
a brick wall at full speed, but if we remove the brick wall and
replace it with a gentle pelting with rotten eggs, it's unclear that a
similarly strenuous reaction is the right model. But that's also not
to say that we should do nothing at all.

Yeah, we would probably need to warn on every 1 million transactions or
something.

My proposal would be to make the threshold configurable and start warning
on every transaction after that. There are a couple reasons to do that.

The first is that noisy warnings are extremely easy to see. You get them
in cron emails, from psql, in the db logs, etc. Having them only every
million transactions makes them harder to catch.

The point here is not to ensure there are no problems, but to make sure
that an existing layer in the current swiss cheese model of safety doesn't
go away. Will it stop all problems? No. But the current warning strategy
is effective, given how many times we hear of cases of people having to
take drastic action to avoid impending xid wraparound.

If someone has an insert-only database and maybe doesn't want to ever
freeze, they can set the threshold to -1 or something. I would suggest
keeping the default at 2 billion to be in line with existing limitations
and practices. People can then adjust as they see fit.

Warning text might be something like "XID Lag Threshold Exceeded. Is
autovacuum clearing space and keeping up?"
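
A minimal sketch of that idea, with a made-up GUC name and a plain
fprintf standing in for the server's logging (this is not code from the
patch set):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical GUC: -1 disables the warning entirely. */
static int64_t xid_lag_warn_threshold = INT64_C(2000000000);

/* Called on each XID allocation; warns once the lag exceeds the threshold. */
static void
warn_if_xid_lag_excessive(uint64_t next_full_xid, uint64_t oldest_datfrozenxid)
{
    uint64_t lag = next_full_xid - oldest_datfrozenxid;

    if (xid_lag_warn_threshold < 0)
        return;                 /* warnings disabled by the administrator */

    if (lag > (uint64_t) xid_lag_warn_threshold)
        fprintf(stderr,
                "WARNING: XID lag threshold exceeded (lag %llu). "
                "Is autovacuum clearing space and keeping up?\n",
                (unsigned long long) lag);
}
```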


--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#152Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Chris Travers (#151)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, Nov 29, 2022 at 9:35 PM Chris Travers <chris@orioledata.com> wrote:

My proposal would be to make the threshold configurable and start warning on every transaction after that. There are a couple reasons to do that.

The first is that noisy warnings are extremely easy to see. You get them in cron emails, from psql, in the db logs, etc. Having them only every million transactions makes them harder to catch.

The point here is not to ensure there are no problems, but to make sure that an existing layer in the current swiss cheese model of safety doesn't go away. Will it stop all problems? No. But the current warning strategy is effective, given how many times we hear of cases of people having to take drastic action to avoid impending xid wraparound.

If someone has an insert-only database and maybe doesn't want to ever freeze, they can set the threshold to -1 or something. I would suggest keeping the default at 2 billion to be in line with existing limitations and practices. People can then adjust as they see fit.

Warning text might be something like "XID Lag Threshold Exceeded. Is autovacuum clearing space and keeping up?"

None of this seems unreasonable to me. If we want to allow more
configurability, we could also let you choose the threshold and the
frequency of warnings (every N transactions).

But, I think we might be getting down a little bit in the weeds. It's
not clear that everybody's on board with the proposed page format
changes. I'm not completely opposed, but I'm also not wild about the
approach. It's probably not a good idea to spend all of our energy
debating the details of how to reform xidWrapLimit without having some
consensus on those points. It is, in a word, bikeshedding: on-disk
page format changes are hard, but everyone understands warning
messages.

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.
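
To illustrate why that helps, compare 32-bit and 64-bit page numbering for
a clog-like SLRU; a minimal sketch, with the transactions-per-page constant
assumed for the default 8 kB block size rather than taken from clog.c:

```c
#include <stdint.h>

#define XACTS_PER_PAGE 32768    /* assumed: 8192-byte page, 4 xacts per byte */

/* 32-bit style: the page number wraps around together with the XID space,
 * so callers need modular-arithmetic logic to compare and truncate pages. */
static uint32_t
clog_pageno_32(uint32_t xid)
{
    return xid / XACTS_PER_PAGE;
}

/* 64-bit style: page numbers grow monotonically and never wrap, so the
 * wraparound-handling code can simply go away. */
static int64_t
clog_pageno_64(uint64_t full_xid)
{
    return (int64_t) (full_xid / XACTS_PER_PAGE);
}
```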

I haven't checked the patches to see whether they look correct, and
I'm concerned in particular about upgrade scenarios. But if there's a
way we can get that part committed, I think it would be a clear win.

--
Robert Haas
EDB: http://www.enterprisedb.com

#153Peter Geoghegan
Peter Geoghegan
pg@bowt.ie
In reply to: Robert Haas (#152)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Nov 30, 2022 at 8:13 AM Robert Haas <robertmhaas@gmail.com> wrote:

I haven't checked the patches to see whether they look correct, and
I'm concerned in particular about upgrade scenarios. But if there's a
way we can get that part committed, I think it would be a clear win.

+1

--
Peter Geoghegan

#154Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Robert Haas (#152)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Robert!

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.

I totally support the idea that the part related to SLRU is worth
committing, whether as the first step towards 64-bit XIDs or separately.
This subset is discussed in a separate thread [1]/messages/by-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A@mail.gmail.com. It seems that we
need more time to reach a consensus on the implementation of the whole
thing; this discussion alone is complicated and reveals many different
aspects concurrently in one thread.

So I'd vote for an evolutionary approach and give my +1 for
committing [1]/messages/by-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A@mail.gmail.com to 16 first.

[1]: /messages/by-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A@mail.gmail.com

Kind regards,
Pavel Borisov,
Supabase.

#155Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Pavel Borisov (#154)
Re: Add 64-bit XIDs into PostgreSQL 15

So I'd vote for an evolutionary approach and give my +1 for
committing [1] to 16 first.

[1]:
/messages/by-id/CAFiTN-uudj2PY8GsUzFtLYFpBoq_rKegW3On_8ZHdxB1mVv3-A@mail.gmail.com

Kind regards,
Pavel Borisov,
Supabase.

+1 Totally support the idea. Let's focus on committing SLRU changes.

--
Best regards,
Maxim Orlov.

#156Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#155)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

I want to make a quick summary here.

1. An overall consensus has been reached: we shall focus on committing SLRU
changes first.
2. I've created an appropriate patch set here [0]/messages/by-id/CACG=ezav34TL+fGXD5vJ48=QbQBL9BiwkOTWduu9yRqie-h+Dg@mail.gmail.com.
3. Now [0]/messages/by-id/CACG=ezav34TL+fGXD5vJ48=QbQBL9BiwkOTWduu9yRqie-h+Dg@mail.gmail.com is waiting for review. As always, all opinions will be welcome.
4. While discussing error/warning messages and some other stuff, this
thread was marked as "Waiting on Author".
5. I do rebase this patch set once a week, but do not post it here,
since there is no need for it. See (1).
6. For now, I don't understand what changes I have to make here. So, is
"Waiting on Author" the appropriate status here?

Anyway. Let's discuss on-disk page format, shall we?

AFAICS, we have the following options:
1. Making "true" 64-bit XIDs. I.e. making every tuple have 64-bit xmin and
xmax fields.
2. Put a special area in every page where the base for XIDs is stored. This
is what we have done in the current patch set (see the sketch below).
3. Put the base for XIDs in a fork.
4. Make explicit 64-bit XIDs for concrete relations. I.e. CREATE TABLE foo
WITH (xid8) or something similar.

There were opinions that the proposed solution (2) is not optimal. It
would be great to hear your concerns and thoughts.
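
A minimal sketch of option (2), with invented struct and field names; the
real patch also has to handle special XID values and page conversion, which
are left out here:

```c
#include <stdint.h>

/* Per-page special area carrying 64-bit bases (names invented). */
typedef struct PageXidBaseSpecial
{
    uint64_t xid_base;      /* base for the 32-bit xmin/xmax stored in tuples */
    uint64_t multi_base;    /* base for the 32-bit multixact ids */
} PageXidBaseSpecial;

/* Tuple headers keep 32-bit values, interpreted as offsets from the base. */
static inline uint64_t
tuple_xmin64(const PageXidBaseSpecial *special, uint32_t t_xmin)
{
    return special->xid_base + t_xmin;
}
```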

[0]: /messages/by-id/CACG=ezav34TL+fGXD5vJ48=QbQBL9BiwkOTWduu9yRqie-h+Dg@mail.gmail.com
/messages/by-id/CACG=ezav34TL+fGXD5vJ48=QbQBL9BiwkOTWduu9yRqie-h+Dg@mail.gmail.com

--
Best regards,
Maxim Orlov.

#157adherent postgres
adherent postgres
adherent_postgres@hotmail.com
In reply to: Maxim Orlov (#156)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Maxim Orlov:

AFAICS, we have the following options:
1. Making "true" 64-bit XIDs. I.e. making every tuple have 64-bit xmin and xmax fields.
2. Put a special area in every page where the base for XIDs is stored. This is what we have done in the current patch set.
3. Put the base for XIDs in a fork.
4. Make explicit 64-bit XIDs for concrete relations. I.e. CREATE TABLE foo WITH (xid8) or something similar.

I think the first solution will not be agreed to by the core committers; they will consider that the change is too big and will affect the stability of PostgreSQL. I think the second solution is actually quite good, and you've been working on it now, and there are successful cases: openGauss is implemented in this way. In order to save space and be compatible with older versions, the openGauss design stores a tuple's xmin/xmax in the tuple header as uint32 values, while the page header stores the 64-bit xid_base of the current page. I think it's best to stick to this solution now.
OpenGauss tuple structure: (diagram attached as image.png)
Best wishes


Attachments:

image.pngimage/jpeg; name=image.png
#158Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Maxim Orlov (#156)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Maxim,

Anyway. Let's discuss on-disk page format, shall we?

Here are my two cents.

AFAICS, we have the following options:
[...]
2. Put a special area in every page where the base for XIDs is stored. This is what we have done in the current patch set.

The approach of using special space IMO is fine. I'm still a bit
sceptical about the need to introduce a new entity "64-bit base XID"
while we already have 32-bit XID epochs that will do the job. I
suspect that having fewer entities helps to reason about the code and
that it is important in the long run, but maybe it's just me. In any
case, I don't have a strong opinion here.
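
For comparison, the epoch framing can be sketched standalone like this
(local definitions only; the server's FullTransactionId packs its 64-bit
value in essentially the same way):

```c
#include <stdint.h>

typedef uint64_t FullXid64;     /* upper 32 bits: epoch, lower 32 bits: xid */

static inline FullXid64
full_xid_from_epoch_and_xid(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

static inline uint32_t
epoch_of(FullXid64 fxid)
{
    return (uint32_t) (fxid >> 32);
}

static inline uint32_t
xid_of(FullXid64 fxid)
{
    return (uint32_t) fxid;
}
```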

Additionally, I think we should be clear about the long term goals. As
Peter G. pointed out above:

I think that we'll be able to get rid of freezing in a few years time.

IMO eventually getting rid of freezing and "magic" XIDs will simplify
the maintenance of the project and also make the behaviour of the
system much more predictable. The user will have to worry only about
the disk space reclamation.

If we have a consensus that this is the final goal then we should
definitely be moving toward 64-bit XIDs and perhaps even include a
corresponding PoC to the patchset. If we want to keep freezing
indefinitely then, as Chris Travers argued, 64-bit XIDs don't bring
that much value and maybe the community should be focusing on
something else.

--
Best regards,
Aleksander Alekseev

#159Daniel Gustafsson
Daniel Gustafsson
daniel@yesql.se
In reply to: Aleksander Alekseev (#158)
Re: Add 64-bit XIDs into PostgreSQL 15

This patch hasn't applied in quite some time, and the thread has moved to
discussing higher-level items rather than the suggested patch, so I'm closing
this as Returned with Feedback. Please feel free to resubmit when there is
renewed interest and a consensus on how/what to proceed with.

--
Daniel Gustafsson

#160Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Daniel Gustafsson (#159)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

This patch hasn't applied in quite some time, and the thread has moved to
discussing higher-level items rather than the suggested patch, so I'm closing
this as Returned with Feedback. Please feel free to resubmit when there is
renewed interest and a consensus on how/what to proceed with.

Yes, this thread awaits several other patches to be merged [1]https://commitfest.postgresql.org/43/3489/ in
order to continue, so it makes sense to mark it as RwF for the time
being. Thanks!

[1]: https://commitfest.postgresql.org/43/3489/

--
Best regards,
Aleksander Alekseev

#161Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Aleksander Alekseev (#160)
7 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi!

Just to keep this thread up to date, here's a new version after recent
changes in SLRU.
I've also changed the order of the patches in the set, to put the initdb MOX
options after the "core 64 xid" patch, since the MOX patch is unlikely to be
committed and is now for test purposes only.

--
Best regards,
Maxim Orlov.

Attachments:

v52-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v52-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v52-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v52-0001-Use-64-bit-format-to-output-XIDs.patch
v52-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v52-0004-Use-64-bit-GUCs.patch
v52-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v52-0003-Use-64-bit-SLRU-pages-in-callers.patch
v52-0005-Use-64-bit-XIDs.patchapplication/octet-stream; name=v52-0005-Use-64-bit-XIDs.patch
v52-0006-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v52-0006-Add-initdb-option-to-initialize-cluster-with-non.patch
v52-0007-README.XID64.patchapplication/octet-stream; name=v52-0007-README.XID64.patch
#162wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Maxim Orlov (#161)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Maxim Orlov
Good news, xid64 has achieved a successful first phase. I tried to change
the patch status (https://commitfest.postgresql.org/43/3594/), but it seems
incorrect.

Maxim Orlov <orlovmg@gmail.com> wrote on Wednesday, December 13, 2023 at 20:26:


Hi!

Just to keep this thread up to date, here's a new version after recent
changes in SLRU.
I've also changed the order of the patches in the set, to put the initdb MOX
options after the "core 64 xid" patch, since the MOX patch is unlikely to be
committed and is now for test purposes only.

--
Best regards,
Maxim Orlov.

#163Pavel Borisov
Pavel Borisov
pashkin.elfe@gmail.com
In reply to: wenhui qiu (#162)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi, Wenhui!

On Fri, 15 Dec 2023 at 05:52, wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Hi Maxim Orlov
Good news, xid64 has achieved a successful first phase. I tried to
change the patch status (https://commitfest.postgresql.org/43/3594/), but
it seems incorrect.

Maxim Orlov <orlovmg@gmail.com> wrote on Wednesday, December 13, 2023 at 20:26:

Hi!

Just to keep this thread up to date, here's a new version after recent
changes in SLRU.
I've also changed the order of the patches in the set, to put the initdb
MOX options after the "core 64 xid" patch, since the MOX patch is unlikely
to be committed and is now for test purposes only.

If the patch is RwF, the CF entry is finished and can't be re-enabled; rather,
the patch needs to be submitted as a new entry, which I have just done.
https://commitfest.postgresql.org/46/4703/

Please feel free to submit your review.

Kind regards,
Pavel Borisov,
Supabase

#164wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Pavel Borisov (#163)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Pavel Borisov
Many thanks

Best wishes

Pavel Borisov <pashkin.elfe@gmail.com> wrote on Friday, December 15, 2023 at 17:13:


Hi, Wenhui!

On Fri, 15 Dec 2023 at 05:52, wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Hi Maxim Orlov
Good news, xid64 has achieved a successful first phase. I tried to
change the patch status (https://commitfest.postgresql.org/43/3594/), but
it seems incorrect.

Maxim Orlov <orlovmg@gmail.com> wrote on Wednesday, December 13, 2023 at 20:26:

Hi!

Just to keep this thread up to date, here's a new version after recent
changes in SLRU.
I've also changed the order of the patches in the set, to put the initdb
MOX options after the "core 64 xid" patch, since the MOX patch is unlikely
to be committed and is now for test purposes only.

If the patch is RwF, the CF entry is finished and can't be re-enabled; rather,
the patch needs to be submitted as a new entry, which I have just done.
https://commitfest.postgresql.org/46/4703/

Please feel free to submit your review.

Kind regards,
Pavel Borisov,
Supabase

#165Shubham Khanna
Shubham Khanna
khannashubham1197@gmail.com
In reply to: Maxim Orlov (#161)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, Dec 13, 2023 at 5:56 PM Maxim Orlov <orlovmg@gmail.com> wrote:

Hi!

Just to keep this thread up to date, here's a new version after recent changes in SLRU.
I've also changed the order of the patches in the set, to put the initdb MOX options after the
"core 64 xid" patch, since the MOX patch is unlikely to be committed and is now for test purposes only.

I tried to apply the patch but it fails on HEAD. It gives the
following error:
Hunk #1 succeeded at 601 (offset 5 lines).
patching file src/backend/replication/slot.c
patching file src/backend/replication/walreceiver.c
patching file src/backend/replication/walsender.c
Hunk #1 succeeded at 2434 (offset 160 lines).
patching file src/backend/storage/ipc/procarray.c
Hunk #1 succeeded at 1115 with fuzz 2.
Hunk #3 succeeded at 1286 with fuzz 2.
Hunk #7 FAILED at 4341.
Hunk #8 FAILED at 4899.
Hunk #9 FAILED at 4959.
3 out of 10 hunks FAILED -- saving rejects to file
src/backend/storage/ipc/procarray.c.rej
patching file src/backend/storage/ipc/standby.c
Hunk #1 FAILED at 1043.
Hunk #2 FAILED at 1370.
2 out of 2 hunks FAILED -- saving rejects to file
src/backend/storage/ipc/standby.c.rej
Please send a rebased version of the patch.

Thanks and Regards,
Shubham Khanna.

#166Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Shubham Khanna (#165)
5 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

Please send a rebased version of the patch.

PFA the rebased patchset.

In order to keep the scope reasonable I suggest we focus on
0001...0005 for now. 0006+ are difficult to rebase / review and I'm a
bit worried for the committer who will merge them. We can return to
0006+ when we deal with the first 5 patches. These patches can be
delivered to PG18 independently, as we did with SLRU in the PG17
cycle.

Tested on Intel MacOS w/ Autotools and ARM Linux w/ Meson.

--
Best regards,
Aleksander Alekseev

Attachments:

v53-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v53-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v53-0005-README.XID64.patchapplication/octet-stream; name=v53-0005-README.XID64.patch
v53-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v53-0004-Use-64-bit-GUCs.patch
v53-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v53-0003-Use-64-bit-SLRU-pages-in-callers.patch
v53-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v53-0001-Use-64-bit-format-to-output-XIDs.patch
#167Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#166)
5 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

Please send a rebased version of the patch.

PFA the rebased patchset.

In order to keep the scope reasonable I suggest we focus on
0001...0005 for now. 0006+ are difficult to rebase / review and I'm a
bit worried for the committer who will merge them. We can return to
0006+ when we deal with the first 5 patches. These patches can be
delivered to PG18 independently, as we did with SLRU in the PG17
cycle.

Tested on Intel MacOS w/ Autotools and ARM Linux w/ Meson.

cfbot revealed a bug in 0004:

```
../src/backend/access/common/reloptions.c:1842:26: runtime error:
store to misaligned address 0x55a5c6d64d94 for type 'int64', which
requires 8 byte alignment
0x55a5c6d64d94: note: pointer points here
ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 ff ff ff ff
^
==18945==Using libbacktrace symbolizer.
#0 0x55a5c4a9b273 in fillRelOptions
../src/backend/access/common/reloptions.c:1842
#1 0x55a5c4a9c709 in build_reloptions
../src/backend/access/common/reloptions.c:2002
#2 0x55a5c4a9c739 in default_reloptions
../src/backend/access/common/reloptions.c:1957
#3 0x55a5c4a9ccc0 in heap_reloptions
../src/backend/access/common/reloptions.c:2109
#4 0x55a5c4da98d6 in DefineRelation ../src/backend/commands/tablecmds.c:858
#5 0x55a5c52549ac in ProcessUtilitySlow ../src/backend/tcop/utility.c:1164
#6 0x55a5c52545a2 in standard_ProcessUtility
../src/backend/tcop/utility.c:1067
#7 0x55a5c52546fd in ProcessUtility ../src/backend/tcop/utility.c:523
#8 0x55a5c524fe5b in PortalRunUtility ../src/backend/tcop/pquery.c:1158
#9 0x55a5c5250531 in PortalRunMulti ../src/backend/tcop/pquery.c:1315
#10 0x55a5c5250bd6 in PortalRun ../src/backend/tcop/pquery.c:791
#11 0x55a5c5249e44 in exec_simple_query ../src/backend/tcop/postgres.c:1274
#12 0x55a5c524cc07 in PostgresMain ../src/backend/tcop/postgres.c:4680
#13 0x55a5c524d18c in PostgresSingleUserMain
../src/backend/tcop/postgres.c:4136
#14 0x55a5c4f155e2 in main ../src/backend/main/main.c:194
#15 0x7fc3a77e5d09 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
#16 0x55a5c4a6a249 in _start
(/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8e1249)

Aborted (core dumped)
```

Here is the fix. It can be tested like this:

```
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -1839,6 +1839,7 @@ fillRelOptions(void *rdopts, Size basesize,
                             ((relopt_int *) options[i].gen)->default_val;
                         break;
                     case RELOPT_TYPE_INT64:
+                        Assert((((uint64)itempos) & 0x7) == 0);
                         *(int64 *) itempos = options[i].isset ?
                             options[i].values.int64_val :
                             ((relopt_int64 *) options[i].gen)->default_val;
```
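
For anyone who hasn't run into this failure mode before, here is a small
standalone sketch (not patch code): fillRelOptions stores the value at an
offset inside the rdopts blob, and if that offset isn't 8-byte aligned, a
direct int64 store is undefined behaviour, which is exactly what UBSan
reports. One way or another the offset has to end up 8-byte aligned, or the
store has to be done byte-wise, for example:

```
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Modelled on PostgreSQL's TYPEALIGN() macro: round LEN up to a multiple of ALIGNVAL. */
#define TYPEALIGN(ALIGNVAL, LEN) \
    (((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

int
main(void)
{
    char   *rdopts = malloc(64);                /* stands in for the reloptions blob */
    size_t  bad_off = 4;                        /* misaligned offset for an int64 */
    size_t  good_off = TYPEALIGN(8, bad_off);   /* rounded up to 8 */
    int64_t val = 42;

    /* *(int64_t *) (rdopts + bad_off) = val;      <-- the UB the sanitizer flags */
    memcpy(rdopts + bad_off, &val, sizeof(val));   /* byte-wise copy is always fine */
    *(int64_t *) (rdopts + good_off) = val;        /* aligned store is fine too */

    printf("bad_off=%zu good_off=%zu\n", bad_off, good_off);
    free(rdopts);
    return 0;
}
```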

--
Best regards,
Aleksander Alekseev

Attachments:

v54-0005-README.XID64.patchapplication/octet-stream; name=v54-0005-README.XID64.patch
v54-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v54-0004-Use-64-bit-GUCs.patch
v54-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v54-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v54-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v54-0001-Use-64-bit-format-to-output-XIDs.patch
v54-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v54-0003-Use-64-bit-SLRU-pages-in-callers.patch
#168Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#167)
5 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

Here is the fix. It can be tested like this:
[...]

PFA the rebased patchset.

--
Best regards,
Aleksander Alekseev

Attachments:

v55-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/x-patch; name=v55-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v55-0004-Use-64-bit-GUCs.patchapplication/x-patch; name=v55-0004-Use-64-bit-GUCs.patch
v55-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/x-patch; name=v55-0003-Use-64-bit-SLRU-pages-in-callers.patch
v55-0001-Use-64-bit-format-to-output-XIDs.patchapplication/x-patch; name=v55-0001-Use-64-bit-format-to-output-XIDs.patch
v55-0005-README.XID64.patchapplication/x-patch; name=v55-0005-README.XID64.patch
#169Peter Eisentraut
Peter Eisentraut
peter@eisentraut.org
In reply to: Aleksander Alekseev (#168)
Re: Add 64-bit XIDs into PostgreSQL 15

On 23.07.24 11:13, Aleksander Alekseev wrote:

Here is the fix. It can be tested like this:
[...]

PFA the rebased patchset.

I'm wondering about the 64-bit GUCs.

At first, it makes sense that if there are settings that are counted in
terms of transactions, and transaction numbers are 64-bit integers, then
those settings should accept 64-bit integers.

But what is the purpose and effect of setting those parameters to such
huge numbers? For example, what is the usability of being able to set

vacuum_failsafe_age = 500000000000

I think in the world of 32-bit transaction IDs, you can intuitively
interpret most of these "transaction age" settings as "percent toward
disaster". For example,

vacuum_freeze_table_age = 150000000

is 7% toward disaster, and

vacuum_failsafe_age = 1600000000

is 75% toward disaster.
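
(For reference, taking 2^31 = 2,147,483,648 as the wraparound point:
150,000,000 / 2^31 ≈ 7%, and 1,600,000,000 / 2^31 ≈ 75%.)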

However, if there is no more disaster threshold at 2^31, what is the
guidance for setting these? Or more radically, why even run
transaction-count-based vacuum at all?

Conversely, if there is still some threshold (not disaster, but
efficiency or something else), would it still be useful to keep these
settings well below 2^31? In which case, we might not need 64-bit GUCs.

Your 0004 patch adds support for 64-bit GUCs but doesn't actually
convert any existing GUCs to use that. (Unlike the reloptions, which
your patch converts.) And so there is no documentation about these
questions.

#170Heikki Linnakangas
Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Peter Eisentraut (#169)
Re: Add 64-bit XIDs into PostgreSQL 15

On 25/07/2024 13:19, Peter Eisentraut wrote:

I'm wondering about the 64-bit GUCs.

At first, it makes sense that if there are settings that are counted in
terms of transactions, and transaction numbers are 64-bit integers, then
those settings should accept 64-bit integers.

But what is the purpose and effect of setting those parameters to such
huge numbers?  For example, what is the usability of being able to set

vacuum_failsafe_age = 500000000000

I think in the world of 32-bit transaction IDs, you can intuitively
interpret most of these "transaction age" settings as "percent toward
disaster".  For example,

vacuum_freeze_table_age = 150000000

is 7% toward disaster, and

vacuum_failsafe_age = 1600000000

is 75% toward disaster.

However, if there is no more disaster threshold at 2^31, what is the
guidance for setting these?  Or more radically, why even run
transaction-count-based vacuum at all?

To allow the CLOG to be truncated. There's no disaster anymore, but
without freezing, the clog will grow indefinitely.
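
(Back-of-the-envelope: pg_xact stores 2 bits of commit status per
transaction, i.e. 4 transactions per byte, so the full 32-bit XID space
amounts to about 1 GiB of clog; without some transaction-count-based
trigger for truncation, 10^12 transactions would leave roughly 250 GB of
it behind.)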

Conversely, if there is still some threshold (not disaster, but
efficiency or something else), would it still be useful to keep these
settings well below 2^31?  In which case, we might not need 64-bit GUCs.

Yeah, I don't think it's critical. It makes sense to switch to 64 bit
GUCs, so that you can make those settings higher, but it's not critical
or strictly required for the rest of the work.

Another approach is to make the GUCs mean "thousands of XIDs",
similar to how many of our memory settings are in kB rather than bytes.
That might be a super confusing change for existing settings though.

--
Heikki Linnakangas
Neon (https://neon.tech)

#171Chris Travers
Chris Travers
chris.travers@gmail.com
In reply to: Peter Eisentraut (#169)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, Jul 25, 2024 at 5:19 PM Peter Eisentraut <peter@eisentraut.org>
wrote:

On 23.07.24 11:13, Aleksander Alekseev wrote:

Here is the fix. It can be tested like this:
[...]

PFA the rebased patchset.

I'm wondering about the 64-bit GUCs.

At first, it makes sense that if there are settings that are counted in
terms of transactions, and transaction numbers are 64-bit integers, then
those settings should accept 64-bit integers.

But what is the purpose and effect of setting those parameters to such
huge numbers? For example, what is the usability of being able to set

vacuum_failsafe_age = 500000000000

Also in the rebased patch set I cannot find the above, so I cannot evaluate
what it does.

In the past I have pushed for some mechanism to produce warnings, like the
ones we currently get when approaching xid wraparound, once a certain
threshold is met. Is this that mechanism?

I think in the world of 32-bit transaction IDs, you can intuitively
interpret most of these "transaction age" settings as "percent toward
disaster". For example,

vacuum_freeze_table_age = 150000000

is 7% toward disaster, and

vacuum_failsafe_age = 1600000000

is 75% toward disaster.

However, if there is no more disaster threshold at 2^31, what is the
guidance for setting these? Or more radically, why even run
transaction-count-based vacuum at all?

Conversely, if there is still some threshold (not disaster, but
efficiency or something else), would it still be useful to keep these
settings well below 2^31? In which case, we might not need 64-bit GUCs.

Your 0004 patch adds support for 64-bit GUCs but doesn't actually
convert any existing GUCs to use that. (Unlike the reloptions, which
your patch converts.) And so there is no documentation about these
questions.

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more

#172Peter Eisentraut
Peter Eisentraut
peter@eisentraut.org
In reply to: Heikki Linnakangas (#170)
Re: Add 64-bit XIDs into PostgreSQL 15

On 25.07.24 13:09, Heikki Linnakangas wrote:

However, if there is no more disaster threshold at 2^31, what is the
guidance for setting these?  Or more radically, why even run
transaction-count-based vacuum at all?

To allow the CLOG to be truncated. There's no disaster anymore, but
without freezing, the clog will grow indefinitely.

Maybe a setting similar to max_wal_size could be better for that?

#173wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Peter Eisentraut (#172)
Re: Add 64-bit XIDs into PostgreSQL 15

Maybe a setting similar to max_wal_size could be better for that?
+1

Thanks

Peter Eisentraut <peter@eisentraut.org> 于2024年7月25日周四 21:31写道:

Show quoted text

On 25.07.24 13:09, Heikki Linnakangas wrote:

However, if there is no more disaster threshold at 2^31, what is the
guidance for setting these? Or more radically, why even run
transaction-count-based vacuum at all?

To allow the CLOG to be truncated. There's no disaster anymore, but
without freezing, the clog will grow indefinitely.

Maybe a setting similar to max_wal_size could be better for that?

#174Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Heikki Linnakangas (#170)
Re: Add 64-bit XIDs into PostgreSQL 15

On Thu, Jul 25, 2024 at 02:09:10PM +0300, Heikki Linnakangas wrote:

On 25/07/2024 13:19, Peter Eisentraut wrote:

Conversely, if there is still some threshold (not disaster, but
efficiency or something else), would it still be useful to keep these
settings well below 2^31?  In which case, we might not need 64-bit GUCs.

Yeah, I don't think it's critical. It makes sense to switch to 64 bit GUCs,
so that you can make those settings higher, but it's not critical or
strictly required for the rest of the work.

It looks like we have some sort of consensus to introduce these GUC
APIs, then? I'd suggest splitting that into its own thread because it
can be treated as an independent subject. (I know that this was
discussed previously, but it's been some time and this is going to
require a rebased version anyway, so..)

Another approach is to make the GUCs mean "thousands of XIDs", similar to
how many of our memory settings are in kB rather than bytes. That might be a
super confusing change for existing settings though.

I find that a bit confusing as well; still, that would be OK in terms
of compatibility as long as you enforce the presence of a unit and
leave the default behavior alone. So it does not sound that bad to
me, either. It is also a bit simpler for users to set. Fewer zeros to
deal with.

Aleksander Alekseev has proposed removing short file names from SLRUs
to simplify their internals, in this thread, so that's one less topic
to deal with here:
/messages/by-id/CAJ7c6TOy7fUW9MuNeOWor3cSFnQg9tgz=mjXHDb94GORtM_Eyg@mail.gmail.com

I am unclear about the rest of the thread. Considering v55-0004 and
v55-0003 as two different topics, what should we do with the other
things? It does not seem to me that we have a clear consensus on whether
we are going to do something about the epoch:xid -> 64-bit XID switch,
and what the arguments in play are here. The thread has been long, so
I may just be missing the details, but a summary would be nice.
--
Michael

#175Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Michael Paquier (#174)
Re: Add 64-bit XIDs into PostgreSQL 15

Apparently, the original thread will inevitably disintegrate into many
separate ones.
To me, it looks like some kind of road map: one for 64-bit GUCs, another one
to remove short file names from SLRUs and, to make things more complicated,
[1] for cf entry [0], to get rid of MultiXactOffset wraparound by switching
to 64 bits.

[0]: https://commitfest.postgresql.org/49/5205/
[1]: /messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
/messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com

--
Best regards,
Maxim Orlov.

#176Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Maxim Orlov (#175)
Re: Add 64-bit XIDs into PostgreSQL 15

Michael, Maxim,

Apparently, the original thread will inevitably disintegrate into many separate ones.
To me, it looks like some kind of road map: one for 64-bit GUCs, another one to remove
short file names from SLRUs and, to make things more complicated, [1] for cf entry [0],
to get rid of MultiXactOffset wraparound by switching to 64 bits.

[0] https://commitfest.postgresql.org/49/5205/
[1] /messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com

Agree.

To me it seems like we didn't reach a consensus regarding switching to
64-bit XIDs. Given that, and the fact that the patchset is rather
difficult to rebase (and review), I suggest focusing on something we
did reach a consensus on. I'm going to close the CF entry for this
particular thread as RwF unless anyone objects. We can always return
to this later, preferably knowing that there is a particular committer
who has the time and energy to merge this.

--
Best regards,
Aleksander Alekseev

#177Aleksander Alekseev
Aleksander Alekseev
aleksander@timescale.com
In reply to: Aleksander Alekseev (#176)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

Agree.

To me it seems like we didn't reach a consensus regarding switching to
64-bit XIDs. Given that, and the fact that the patchset is rather
difficult to rebase (and review), I suggest focusing on something we
did reach a consensus on. I'm going to close the CF entry for this
particular thread as RwF unless anyone objects. We can always return
to this later, preferably knowing that there is a particular committer
who has the time and energy to merge this.

I started a new thread and opened a new CF entry for Int64 GUCs:

https://commitfest.postgresql.org/50/5253/

--
Best regards,
Aleksander Alekseev

#178Evgeny
Evgeny
evorop@gmail.com
In reply to: Aleksander Alekseev (#177)
12 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hello!

The next edition of the xid64 patch has been implemented.

In the presented version:

1) gathered the patches MXIDOFFSETv9 [1], GUC64v2 [2], and xid64 (v52-v55)
(the latest versions from the relevant threads have been used);

2) the resulting patches have been rebased onto the current master
(3f9b9621766);

3) implemented HeapTupleCopyRawXidsFromPage. This function avoids
substitution of xids and returns their real values even in the case of a
frozen tuple;

4) fixed issues revealed by tests;

5) in addition, fixed the "stale xid64" problem in the
heapam_tuple_satisfies_snapshot function. This issue mostly appeared in
practice as the "missing chunk" error during operations on TOASTs of the
pg_statistic table.

Looking forward to cooperation and review,

Evgeny Voropaev,

TantorLabs, LLC.

<evorop@gmail.com>

<evgeny.voropaev@tantorlabs.ru>

[1] MXIDOFFSETv9 -
/messages/by-id/CACG=ezajc_Pcqmy6fcq-N8+LzCRMzOzJzez2_BgHEu-6RVJtKQ@mail.gmail.com

[2] GUC64v2 -
/messages/by-id/4c0c1924-a68c-4b3f-bf81-5af13b1a9087@gmail.com

Attachments:

v56-0010-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=UTF-8; name=v56-0010-Add-initdb-option-to-initialize-cluster-with-non.patch
v56-0011-README.XID64.patchtext/x-patch; charset=UTF-8; name=v56-0011-README.XID64.patch
v56-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patchtext/x-patch; charset=UTF-8; name=v56-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patch
v56-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=UTF-8; name=v56-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v56-0008-Use-64-bit-SLRU-pages-in-callers.patchtext/x-patch; charset=UTF-8; name=v56-0008-Use-64-bit-SLRU-pages-in-callers.patch
v56-0001-Use-64-bit-format-output-for-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v56-0001-Use-64-bit-format-output-for-multixact-offsets.patch
v56-0002-Use-64-bit-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v56-0002-Use-64-bit-multixact-offsets.patch
v56-0003-Make-pg_upgrade-convert-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v56-0003-Make-pg_upgrade-convert-multixact-offsets.patch
v56-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patchtext/x-patch; charset=UTF-8; name=v56-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patch
v56-0005-Support-64-bit-integer-GUCs.patchtext/x-patch; charset=UTF-8; name=v56-0005-Support-64-bit-integer-GUCs.patch
v56-0006-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=UTF-8; name=v56-0006-Use-64-bit-format-to-output-XIDs.patch
v56-0009-Use-64-bit-XIDs.patchtext/x-patch; charset=UTF-8; name=v56-0009-Use-64-bit-XIDs.patch
#179Evgeny
Evgeny
evorop@gmail.com
In reply to: Evgeny (#178)
12 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

The v56 patch version includes redundant additions which are not quite
relevant to the xid64. The v57 version is refined.

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.
<evorop@gmail.com>
<evgeny.voropaev@tantorlabs.ru>

Attachments:

v57-0001-Use-64-bit-format-output-for-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v57-0001-Use-64-bit-format-output-for-multixact-offsets.patch
v57-0002-Use-64-bit-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v57-0002-Use-64-bit-multixact-offsets.patch
v57-0003-Make-pg_upgrade-convert-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v57-0003-Make-pg_upgrade-convert-multixact-offsets.patch
v57-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patchtext/x-patch; charset=UTF-8; name=v57-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patch
v57-0005-Support-64-bit-integer-GUCs.patchtext/x-patch; charset=UTF-8; name=v57-0005-Support-64-bit-integer-GUCs.patch
v57-0006-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=UTF-8; name=v57-0006-Use-64-bit-format-to-output-XIDs.patch
v57-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=UTF-8; name=v57-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v57-0008-Use-64-bit-SLRU-pages-in-callers.patchtext/x-patch; charset=UTF-8; name=v57-0008-Use-64-bit-SLRU-pages-in-callers.patch
v57-0009-Use-64-bit-XIDs.patchtext/x-patch; charset=UTF-8; name=v57-0009-Use-64-bit-XIDs.patch
v57-0010-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=UTF-8; name=v57-0010-Add-initdb-option-to-initialize-cluster-with-non.patch
v57-0011-README.XID64.patchtext/x-patch; charset=UTF-8; name=v57-0011-README.XID64.patch
v57-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patchtext/x-patch; charset=UTF-8; name=v57-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patch
#180Evgeny
Evgeny
evorop@gmail.com
In reply to: Aleksander Alekseev (#158)
12 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hello hackers!
There are several upgrades in the xid64 patch.
1. Refactored the functions HeapTupleCopyRawXidsFromPage and
HeapTupleCopyXidsFromPage - the repeated part of these functions was moved
into a separate HeapTupleCopyXidsFromPage_Basis in order to avoid
code duplication.
2. In HeapTupleCopyXidsFromPage_Basis, a check that the tuple actually
belongs to the page has been implemented. It is quite a rough check, based
on comparing the addresses of the page, the tuple and the special area,
but it is sufficient to detect the problem.
3. Simplified the signature of the convert_page function. The Page parameter
is no longer passed in - it is calculated inside the convert_page
function now.

Base: current master (39240bcad56, 2024-12-16).

Best regards,
Evgeny Voropaev,
Tantor Labs LLC.

Attachments:

v58-0003-Make-pg_upgrade-convert-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v58-0003-Make-pg_upgrade-convert-multixact-offsets.patch
v58-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patchtext/x-patch; charset=UTF-8; name=v58-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patch
v58-0001-Use-64-bit-format-output-for-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v58-0001-Use-64-bit-format-output-for-multixact-offsets.patch
v58-0002-Use-64-bit-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v58-0002-Use-64-bit-multixact-offsets.patch
v58-0005-Support-64-bit-integer-GUCs.patchtext/x-patch; charset=UTF-8; name=v58-0005-Support-64-bit-integer-GUCs.patch
v58-0006-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=UTF-8; name=v58-0006-Use-64-bit-format-to-output-XIDs.patch
v58-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=UTF-8; name=v58-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v58-0008-Use-64-bit-SLRU-pages-in-callers.patchtext/x-patch; charset=UTF-8; name=v58-0008-Use-64-bit-SLRU-pages-in-callers.patch
v58-0009-Use-64-bit-XIDs.patchtext/x-patch; charset=UTF-8; name=v58-0009-Use-64-bit-XIDs.patch
v58-0010-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=UTF-8; name=v58-0010-Add-initdb-option-to-initialize-cluster-with-non.patch
v58-0011-README.XID64.patchtext/x-patch; charset=UTF-8; name=v58-0011-README.XID64.patch
v58-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patchtext/x-patch; charset=UTF-8; name=v58-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patch
#181wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Evgeny (#179)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Evgeny
The xid64 patch has been split into several threads; the current one should be this:
/messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
We can do some tests on that patch so that it can be merged earlier.

Thanks

On Tue, Dec 10, 2024 at 7:47 PM Evgeny <evorop@gmail.com> wrote:


The v56 patch version includes redundant additions which are not quite
relevant to the xid64. The v57 version is refined.

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.
<evorop@gmail.com>
<evgeny.voropaev@tantorlabs.ru>

#182Evgeny Voropaev
Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: wenhui qiu (#181)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Evgeny
The xid64 patch has been split into several threads; the current one should be
this: /messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
We can do some tests on that patch so that it can be merged earlier.
Thanks

Hello Wenhui!

Thank you for pointing out the specific topic developing the Multixact
ID Offset MXIDOFF64 solution. I fully agree with the community's decision
to split xid64 into smaller parts and subsequently merge them into master.
In the meantime, by abandoning the full version of the original patch and
developing only its fragments, we are causing the target solution to become
stale. That is why I have tried to put the whole solution together. The
xid64 patch (v56-v58) includes the MXIDOFF patch as distinct files. The
first four patch files of xid64 are exactly the 9th version of MXIDOFF64.
If new versions of the xid64 patch update the parts relevant to MXIDOFF64,
we will add them to the MXIDOFF64 thread as the next version. I also hope
you and the community will support attempts to keep the whole xid64
solution in a consistent state, because only a consistent state allows us
to continually test and fix the xid64 solution.

Best regards,
Evgeny Voropaev,
Tantor Labs LLC.

#183wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Evgeny Voropaev (#182)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Evgeny

we are causing the target solution to become stale

I think the community is not positive enough about xid64; there is also a
lot of resistance, for example because the WAL becomes larger.

I also hope you and the community will support attempts to keep the
whole xid64 solution in a consistent state, because only a consistent
state allows us to continually test and fix the xid64 solution.

My point of view is the same as yours; I just kindly remind you that the
community's preference is to split the work into small patches, like logical
replication DDL, which was even more popular and still ended up with no news.

Thanks

On Tue, Dec 17, 2024 at 3:28 PM Evgeny Voropaev <
evgeny.voropaev@tantorlabs.com> wrote:


Hi Evgeny
The xid64 patch has been split into several threads; the current one should be
this:

/messages/by-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com

We can do some tests on that patch so that it can be merged earlier.
Thanks

Hello Wenhui!

Thank you for pointing out the specific topic developing the Multixact
ID Offset MXIDOFF64 solution. I fully agree with the community's decision
to split xid64 into smaller parts and subsequently merge them into master.
In the meantime, by abandoning the full version of the original patch and
developing only its fragments, we are causing the target solution to become
stale. That is why I have tried to put the whole solution together. The
xid64 patch (v56-v58) includes the MXIDOFF patch as distinct files. The
first four patch files of xid64 are exactly the 9th version of MXIDOFF64.
If new versions of the xid64 patch update the parts relevant to MXIDOFF64,
we will add them to the MXIDOFF64 thread as the next version. I also hope
you and the community will support attempts to keep the whole xid64
solution in a consistent state, because only a consistent state allows us
to continually test and fix the xid64 solution.

Best regards,
Evgeny Voropaev,
Tantor Labs LLC.

#184Evgeny Voropaev
Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: wenhui qiu (#183)
15 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hello hackers!

The current version of the xid64 patch is attached. The patch has been
rebased onto the master branch (1eb399366ed, 2025-03-26).
This version fixes several bugs found since December 2024.
1. Fixed the error that occurred when parsing an XLog record in the
DecodeUpdate function. The error occurred when the XLOG_HEAP_INIT_PAGE flag
was set.

2. Bugs related to page inconsistencies after a prune_freeze operation:
- fixed the bug with the XMAX_COMMITTED hint bit;
- fixed the error "An unsaved xmax at a freezing during a replication".

Best regards,
Evgeny Voropaev, Tantor Labs.

Attachments:

v60-0010-Add-initdb-option-to-initialize-cluster-with-non.patchtext/x-patch; charset=UTF-8; name=v60-0010-Add-initdb-option-to-initialize-cluster-with-non.patch
v60-0011-README.XID64.patchtext/x-patch; charset=UTF-8; name=v60-0011-README.XID64.patch
v60-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patchtext/x-patch; charset=UTF-8; name=v60-0012-Fixed-the-stale-xid64-problem-at-the-heapam_tupl.patch
v60-0001-Use-64-bit-format-output-for-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v60-0001-Use-64-bit-format-output-for-multixact-offsets.patch
v60-0002-Use-64-bit-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v60-0002-Use-64-bit-multixact-offsets.patch
v60-0003-Make-pg_upgrade-convert-multixact-offsets.patchtext/x-patch; charset=UTF-8; name=v60-0003-Make-pg_upgrade-convert-multixact-offsets.patch
v60-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patchtext/x-patch; charset=UTF-8; name=v60-0004-Get-rid-of-MultiXactMemberFreezeThreshold-call.patch
v60-0005-Support-64-bit-integer-GUCs.patchtext/x-patch; charset=UTF-8; name=v60-0005-Support-64-bit-integer-GUCs.patch
v60-0006-Use-64-bit-format-to-output-XIDs.patchtext/x-patch; charset=UTF-8; name=v60-0006-Use-64-bit-format-to-output-XIDs.patch
v60-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchtext/x-patch; charset=UTF-8; name=v60-0007-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v60-0008-Use-64-bit-SLRU-pages-in-callers.patchtext/x-patch; charset=UTF-8; name=v60-0008-Use-64-bit-SLRU-pages-in-callers.patch
v60-0009-Use-64-bit-XIDs.patchtext/x-patch; charset=UTF-8; name=v60-0009-Use-64-bit-XIDs.patch
v60-0013-Fixed-the-error-that-occurred-when-parsing-an-XL.patchtext/x-patch; charset=UTF-8; name=v60-0013-Fixed-the-error-that-occurred-when-parsing-an-XL.patch
v60-0014-Removed-the-management-of-repairing-fragmentatio.patchtext/x-patch; charset=UTF-8; name=v60-0014-Removed-the-management-of-repairing-fragmentatio.patch
v60-0015-Fixed-the-error-An-unsaved-xmax-at-a-freezing-du.patchtext/x-patch; charset=UTF-8; name=v60-0015-Fixed-the-error-An-unsaved-xmax-at-a-freezing-du.patch
#185Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Evgeny Voropaev (#184)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

On Wed, 26 Mar 2025 at 19:18, Evgeny Voropaev <
evgeny.voropaev@tantorlabs.com> wrote:

Hello hackers!

The current version of the xid64 patch is attached. The patch has been
rebased onto the master branch (1eb399366ed, 2025-03-26).

Thank you for the rebase! I appreciate you wanting to help. In fact, I update
this patch almost every month, and I am aware of the aforementioned flaws (as
well as a few others). I don't post updates to the mailing list because I
intend to rebuild this patch, as some aspects of it are no longer relevant.
If all goes well, I'll complete it in the coming weeks. Unfortunately, it is
now evident to me that the patch in its current form cannot be committed. The
only way to go is to do it in an incremental way. Step by step. As previously
said, it is best to focus on [0], which introduces 64-bit mxoff.

However, as the primary developer of this patch set, I respectfully request
that
you refrain from refactoring the code here, as this hampers future rebases,
given the patch's size and complexity. We'll gladly take any refactoring
recommendations when and if the patch is ready to commit.

However, if you discover any bugs, I advise you to report them in this thread
first and, if you can, include a reproduction. It would be extremely helpful.
Otherwise, we risk ending up with a mess here.

As for the bug-fixes part:

1) Yes, heapam_tuple_satisfies_snapshot needs to re-copy XIDs from the page
for a tuple, and as an ad-hoc fix this will work, but a proper fix should
consider all cases (present and foreseeable in the future); thus, the
HeapTupleSatisfies* functions should do this. This is not easy because we
must know whether this particular tuple is from the TOAST relation or not.
Passing that straight through a function argument would be a very ugly
solution. I'll try to address these issues in the next iteration.

2) The fix for DecodeUpdate is bogus. The datalen is not calculated properly.

3) The fix "Removed the management of repairing fragmentation": I don't
really get it. In my rebase, it works as it should, and I don't understand
why it could be removed.

Overall, it looks like there is a problem with this rebase, unfortunately.
PFA rebased @ dbf42b84ac patch set.

[0]: https://commitfest.postgresql.org/patch/5205/

--
Best regards,
Maxim Orlov.

Attachments:

v61-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v61-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v61-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v61-0004-Use-64-bit-GUCs.patch
v61-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v61-0001-Use-64-bit-format-to-output-XIDs.patch
v61-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v61-0003-Use-64-bit-SLRU-pages-in-callers.patch
v61-0007-NormalTransactionIdToShort-should-distinct-multi.patchapplication/octet-stream; name=v61-0007-NormalTransactionIdToShort-should-distinct-multi.patch
v61-0008-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v61-0008-Add-initdb-option-to-initialize-cluster-with-non.patch
v61-0005-Use-64-bit-XIDs.patchapplication/octet-stream; name=v61-0005-Use-64-bit-XIDs.patch
v61-0006-README.XID64.patchapplication/octet-stream; name=v61-0006-README.XID64.patch
#186Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#185)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Here is a new rebase @ 70a13c528b6e382a381f that includes solutions for the
previously mentioned issues as well as
a few more fixes and tweaks.

I decided to eliminate the TOAST flag for the function's parameters and
instead check pg_special for it.

As a result of 15a79c73111f0c9 and a0ed19e0a9efe9, we also have to utilise
the PRI* format string. At the moment, I'm
not sure how to integrate this into the existing patch set. I simply added
it on top of the existing ones. The only possibility
appears to be to return to the previous version in which the primary patch
of XIDs and the patch changing the message
format to 64 bits were combined.

--
Best regards,
Maxim Orlov.

Attachments:

v62-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v62-0003-Use-64-bit-SLRU-pages-in-callers.patch
v62-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v62-0004-Use-64-bit-GUCs.patch
v62-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v62-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v62-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v62-0001-Use-64-bit-format-to-output-XIDs.patch
v62-0005-Use-64-bit-XIDs.patchapplication/octet-stream; name=v62-0005-Use-64-bit-XIDs.patch
v62-0006-README.XID64.patchapplication/octet-stream; name=v62-0006-README.XID64.patch
v62-0007-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v62-0007-Add-initdb-option-to-initialize-cluster-with-non.patch
v62-0008-Use-PRI-64-instead-of-ll-in-XID-format.patchapplication/octet-stream; name=v62-0008-Use-PRI-64-instead-of-ll-in-XID-format.patch
#187Evgeny Voropaev
Evgeny Voropaev
evorop.wiki@gmail.com
In reply to: Maxim Orlov (#186)
Re: Add 64-bit XIDs into PostgreSQL 15

Glad to see the activity around the full version of xid64!

1) About extra freezing in `freeze_single_heap_page`.

In xid64v62, `freeze_single_heap_page` performs freezing via a call to
`heap_page_prune_and_freeze`.
Why, then, does `freeze_single_heap_page` perform freezing a second time?

```
freeze_single_heap_page(Relation relation, Buffer buffer)
{
...
// First freezing
heap_page_prune_and_freeze(relation, buffer, vistest, 0, NULL, &presult,
PRUNE_ON_ACCESS, &dummy_off_loc, NULL, NULL, false);
...
// Preparation for Extra freezing
for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum =
OffsetNumberNext(offnum))
{
...
heap_prepare_freeze_tuple(&tuple, &cutoffs, &pagefrz,
&frozen[nfrozen], &totally_frozen)
...
}
...
// Extra freezing
heap_freeze_prepared_tuples(relation, buffer, frozen, nfrozen);
...
}
```

Are the actions of `heap_page_prune_and_freeze` not sufficient for freezing?
If this extra freezing step is redundant, we might want to consider
removing this duplicated freezing.

2) About repairing fragmentation.

The original approach implemented in PG18 assumes that fragmentation repair
occurs during every `prune_freeze` operation. It happens because the
logic of the "redo" function `heap_xlog_prune_freeze` assumes that
fragmentation repair has to be done by `heap_page_prune_execute`. Attempting
to omit that repair can result in page inconsistencies on the "redo" side
(i.e. on a secondary node, or during the recovery process on the primary
one). So, implementing optional repairing of fragmentation
conflicts with the basic assumption about the "necessity of fragmentation".
In order to prevent this inconsistency, the xid64v62 patch invokes
`heap_page_prune_and_freeze` with `repairFragmentation` equal to true
from everywhere in the patch code except from
`heap_page_prepare_for_xid`, which uses `repairFragmentation=false`.

So, why must we perform a `heap_page_prune_execute` without a
fragmentation during the preparation of a page for xid?

What exactly would break if we did perform `heap_page_prune_execute`
with NO fragmentation repair during `heap_page_prepare_for_xid`?

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.

#188Evgeny Voropaev
Evgeny Voropaev
evorop.wiki@gmail.com
In reply to: Evgeny Voropaev (#187)
Re: Add 64-bit XIDs into PostgreSQL 15

What exactly would break if we did perform `heap_page_prune_execute`
with NO fragmentation repair during `heap_page_prepare_for_xid`?

A correction to the last question:

What exactly would break if we did invoke `heap_page_prune_execute` with
`repairFragmentation=true` while performing `heap_page_prepare_for_xid`?

#189Yura Sokolov
Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Evgeny Voropaev (#188)
Re: Add 64-bit XIDs into PostgreSQL 15

11.06.2025 09:00, Evgeny Voropaev wrote:

2) About repairing fragmentation.

The original approach implemented in PG18 assumes that fragmentation repair
occurs during every `prune_freeze` operation. It happens because the
logic of the "redo" function `heap_xlog_prune_freeze` assumes that
fragmentation repair has to be done by `heap_page_prune_execute`.

Attempting to
omit that repair can result in page inconsistencies on the "redo" side
(i.e. on a secondary node, or during the recovery process on the primary
one).

No! Because the patch uses a flag in the WAL record to instruct the "redo"
side to omit fragmentation as well if needed.

So, implementing optional repairing of fragmentation
conflicts with the basic assumption about the "necessity of fragmentation".
In order to prevent this inconsistency, the xid64v62 patch invokes
`heap_page_prune_and_freeze` with `repairFragmentation` equal to true
from everywhere in the patch code except from
`heap_page_prepare_for_xid`, which uses `repairFragmentation=false`.

So, why must we perform a `heap_page_prune_execute` without a
fragmentation during the preparation of a page for xid?

What exactly would break if we did invoke `heap_page_prune_execute` with
`repairFragmentation=true` while performing `heap_page_prepare_for_xid`?

Short answer:
- the `repairFragmentation` parameter was added after investigating real
production issues with earlier patch versions.

Long answer:

How does SELECT work with tuples on a page?
It:
- PINS the page
- takes the CONTENT LOCK in SHARED mode
- collects HeapTuples which LOOK INTO the RAW PAGE via t_data.t_choice.t_heap
- RELEASES the content lock
- may use those HeapTuples for an indefinitely long time, relying only on the
PIN of the page.

I.e. SELECT relies on the fact that, while a page is pinned, tuples on the
page stay at the same positions in memory.

That is why LockBufferForCleanup and ConditionalLockBufferForCleanup check
that there is only a single PIN on the page - only the backend which will
perform the cleanup is allowed to PIN the page.

UPDATE/INSERT/DELETE take the CONTENT LOCK in EXCLUSIVE mode because they may
add new tuples. But they are not allowed to move tuples, because concurrent
backends are allowed to read tuples from the page at exactly the same moment.
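
In code, the pattern looks roughly like this (a simplified sketch of the
buffer/page API, not a quote from heapam.c; visibility checks and error
handling omitted):

```
/* Simplified sketch of the SELECT-side access pattern described above. */
static void
read_one_tuple(Relation rel, BlockNumber blkno, OffsetNumber offnum)
{
    Buffer      buf = ReadBuffer(rel, blkno);   /* PIN the page */
    Page        page;
    ItemId      lp;
    HeapTupleData tuple;

    LockBuffer(buf, BUFFER_LOCK_SHARE);         /* CONTENT LOCK, shared */

    page = BufferGetPage(buf);
    lp = PageGetItemId(page, offnum);
    tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp); /* points into the raw page */
    tuple.t_len = ItemIdGetLength(lp);
    ItemPointerSet(&tuple.t_self, blkno, offnum);

    LockBuffer(buf, BUFFER_LOCK_UNLOCK);        /* content lock released... */

    /* ...but tuple.t_data may still be dereferenced here: only the PIN
     * guarantees the tuple is not moved within the page or evicted. */

    ReleaseBuffer(buf);                         /* UNPIN */
}
```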

--
regards
Yura Sokolov aka funny-falcon

#190Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Yura Sokolov (#189)
8 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Here is a new rebase @ 40a96cd148

Add a bug fix for a false snapshot horizon conflict.

--
Best regards,
Maxim Orlov.

Attachments:

v63-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patchapplication/octet-stream; name=v63-0002-Use-64-bit-FullTransactionId-instead-of-Epoch-xi.patch
v63-0004-Use-64-bit-GUCs.patchapplication/octet-stream; name=v63-0004-Use-64-bit-GUCs.patch
v63-0001-Use-64-bit-format-to-output-XIDs.patchapplication/octet-stream; name=v63-0001-Use-64-bit-format-to-output-XIDs.patch
v63-0003-Use-64-bit-SLRU-pages-in-callers.patchapplication/octet-stream; name=v63-0003-Use-64-bit-SLRU-pages-in-callers.patch
v63-0005-Use-64-bit-XIDs.patchapplication/octet-stream; name=v63-0005-Use-64-bit-XIDs.patch
v63-0006-README.XID64.patchapplication/octet-stream; name=v63-0006-README.XID64.patch
v63-0007-Add-initdb-option-to-initialize-cluster-with-non.patchapplication/octet-stream; name=v63-0007-Add-initdb-option-to-initialize-cluster-with-non.patch
v63-0008-Use-PRI-64-instead-of-ll-in-XID-format.patchapplication/octet-stream; name=v63-0008-Use-PRI-64-instead-of-ll-in-XID-format.patch
#191Evgeny Voropaev
Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Yura Sokolov (#189)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Yuriy, thank you for the explanation!

But they are not allowed to move tuples, because concurrent
backends are allowed to read tuples from the page at exactly the same moment.

Now I understand that we cannot repair fragmentation if we don’t have a
buffer cleanup lock on the page.

Nevertheless, using the heap_page_prune_and_freeze function without
repairing fragmentation still leads to the inconsistency problem; see below.

No! Because patch uses flag in WAL record to instruct "redo"-side to omit
fragmentation as well if needed.

In short, using the XLHP_REPAIR_FRAGMENTATION flag set to `false` is
a direct road to page inconsistency.

In more detail, in the case of omitting fragmentation we can end
up with an inconsistent page state on the "redo" side. Please see
attachment #1 for an illustration. The inconsistency can occur
because of a difference between the functions heap_page_prune_and_freeze
("do" side) and heap_xlog_prune_freeze ("redo" side). The difference is
that the former performs heap_prune_satisfies_vacuum before
pruning, and the latter does not. It can result in the
following situation:
DO-side:
heap_page_prune_and_freeze =>
1) heap_prune_satisfies_vacuum => HEAP_XMAX_COMMITTED=1 for
some tuples
2) heap_page_prune_execute => prune some tuples
3) Do not repair fragmentation
Result: garbage on the page comprising pruned tuples having
HEAP_XMAX_COMMITTED=1

REDO-side:
heap_xlog_prune_freeze =>
2) heap_page_prune_execute => prune tuple/tuples
3) Do not repair fragmentation
Result: garbage with pruned tuples having HEAP_XMAX_COMMITTED=0

And that difference in HEAP_XMAX_COMMITTED is the inconsistency, even
when it is in garbage on a page. And, probably, it is not the only
example of the problem. I discovered this situation in xid64v58, since I
had invoked `heap_page_prune_and_freeze(repairFragmentation=FALSE)` from
everywhere I used it.

As for xid64v61-v63, it invokes
`heap_page_prune_and_freeze(repairFragmentation=TRUE)` from all places
in the code. But the patch still invokes
`heap_page_prune_and_freeze(repairFragmentation=FALSE)` from
`freeze_single_heap_page`. And we can potentially lead a page to
inconsistency here. If we cannot, please tell us why we cannot.

Unfortunately, I have not yet made a test revealing this problem.

Attachments:

inconsistency_while_repairFragmentation_is_false.pngimage/png; name=inconsistency_while_repairFragmentation_is_false.png
#192Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Evgeny Voropaev (#191)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, 1 Jul 2025 at 14:08, Evgeny Voropaev <evgeny.voropaev@tantorlabs.com>
wrote:

And that difference in HEAP_XMAX_COMMITTED is the inconsistency, even
when it is in garbage on a page. And, probably, it is not the only
example of the problem.

I can't imagine that this action would cause any real issues. If you
know of any real problems, please tell me. Although I'm not
particularly against the idea of zeroing tuples, I don't see much
point in it, to be honest. But to ensure everyone's peace of mind,
I've made yet another commit that accomplishes this.

--
Best regards,
Maxim Orlov.

Attachments:

v64.zipapplication/zip; name=v64.zip
#193Andrey Borodin
Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Maxim Orlov (#192)
Re: Add 64-bit XIDs into PostgreSQL 15

On 2 Jul 2025, at 18:38, Maxim Orlov <orlovmg@gmail.com> wrote:

If you
know of any real problems, please tell me

If I understood correctly, pages can differ on primary and standby. That might be problematic for WAL debug tests.

Best regards, Andrey Borodin.

#194Yura Sokolov
Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Andrey Borodin (#193)
Re: Add 64-bit XIDs into PostgreSQL 15

04.07.2025 11:31, Andrey Borodin wrote:

On 2 Jul 2025, at 18:38, Maxim Orlov <orlovmg@gmail.com> wrote:

If you
know of any real problems, please tell me

If I understood correctly, pages can differ on primary and standby. That might be problematic for WAL debug tests.

They may differ. They differ even now. It is normal.

That is why, when you enable "wal_consistency_checking" (a developer option,
not present in the sample postgresql.conf), pages are masked: the pieces that
are allowed to differ are zeroed (page hint bits, tuple hint bits, the space
between pd_lower and pd_upper).

Before the last version, the empty space between tuples with
"repair_fragmentation=false" was certainly a possible source of differences
not masked by the standard masking procedure. But now those holes are zeroed.

--
regards
Yura Sokolov aka funny-falcon

#195Evgeny Voropaev
Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Yura Sokolov (#194)
2 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hello, hackers!

Unfortunately, the problem of inconsistency when using prune_freeze
with repairFragmentation=false does not only pertain to the content of
dead and unused tuples, but it can also bring about inconsistency in
the locations of live tuples.

This case appears in the logic of heap_insert. See the attached figure.
When heap_insert determines that a new tuple is the only one on a page,
it sets XLOG_HEAP_INIT_PAGE and, as a result, the "redo" side
initializes a new page and inserts the new tuple on this new page
instead of inserting the new tuple on the existing page.

So, we have the following situation in the xid64 patch.

Do-side:
1. Having page ABC with several tuples.
2. Starting to perform insertion of new tuple
2.1. In the case of an inappropriate xid_base, trying to fit base
2.1.1 Freezing and pruning tuples without further repairing
fragmentation.
2.1.2 All tuples have been pruned (no alive tuples on the page
since this moment)
3. Inserting a new tuple and setting XLOG_HEAP_INIT_PAGE, assuming that
the only tuple located at the bottom of the page (assuming that
fragmentation has been performed).

Result: We have the ABC page with the new tuple inserted somewhere in
the MIDDLE of the page and surrounded with garbage from dead and unused
tuples. At the same time we have an xlog record bringing the
XLOG_HEAP_INIT_PAGE bit.

Redo-side
1. Observing XLOG_HEAP_INIT_PAGE
2. Creating a new page and inserting the new tuple into the first
position of the page.

Result: We have the ABC page with the new tuple inserted at the BOTTOM
of the page.

This example of inconsistency is not about the content of the tuple but
about the tuple's location on the page. And tuple offsets are not subject
to masking by the standard masking procedure.

A possible fix could be like the one in the attachment. But what I'm trying
to suggest is adhering to the original PostgreSQL implementation: performing
prune_freeze only under a buffer cleanup lock, and fully excluding
repairFragmentation=false as a vice!

Best regards,
Evgeny Voropaev,
Tantor Labs, LLC.

Attachments:

0001-fixing-page-inconsistency-with-a-single-tuple-on-the-page.patchtext/x-patch; charset=UTF-8; name=0001-fixing-page-inconsistency-with-a-single-tuple-on-the-page.patch
single-tuple-page-inconsistency.pngimage/png; name=single-tuple-page-inconsistency.png
#196Yura Sokolov
Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Evgeny Voropaev (#195)
Re: Add 64-bit XIDs into PostgreSQL 15

07.07.2025 11:17, Evgeny Voropaev wrote:

Hello, hackers!

Unfortunately, the problem of inconsistency when using prune_freeze
with repairFragmentation=false does not only pertain to the content of
dead and unused tuples, but it can also bring about inconsistency in
the locations of live tuples.

This case appears in the logic of heap_insert. See the attached figure.
When heap_insert determines that a new tuple is the only one on a page,
it sets XLOG_HEAP_INIT_PAGE and, as a result, the "redo" side
initializes a new page and inserts the new tuple on this new page
instead of inserting the new tuple on the existing page.

So, we have the following situation in the xid64 patch.

Do-side:
1. Having page ABC with several tuples.
2. Starting to perform insertion of new tuple
2.1. In the case of an inappropriate xid_base, trying to fit base
2.1.1 Freezing and pruning tuples without further repairing
fragmentation.
2.1.2 All tuples have been pruned (no alive tuples on the page
since this moment)
3. Inserting a new tuple and setting XLOG_HEAP_INIT_PAGE, assuming that
the only tuple located at the bottom of the page (assuming that
fragmentation has been performed).

Result: We have the ABC page with the new tuple inserted somewhere in
the MIDDLE of the page and surrounded with garbage from dead and unused
tuples. At the same time we have an xlog record bringing the
XLOG_HEAP_INIT_PAGE bit.

Redo-side
1. Observing XLOG_HEAP_INIT_PAGE
2. Creating a new page and inserting the new tuple into the first
position of the page.

Result: We have the ABC page with the new tuple inserted at the BOTTOM
of the page.

This example of inconsistency is not about the content of the tuple but
about the tuple's location on the page. And tuple offsets are not subject
to masking by the standard masking procedure.

Wow, it is a really great bug.

A possible fix could be like the one in the attachment. But what I'm trying
to suggest is adhering to the original PostgreSQL implementation: performing
prune_freeze only under a buffer cleanup lock, and fully excluding
repairFragmentation=false as a vice!

Not setting XLOG_HEAP_INIT_PAGE, like in your patch, looks like the right
way to me.

"only under a buffer cleanup lock" means that no other backend may have a pin
on this page. This greatly increases the rate of failure on update or delete
of a tuple on this page. For insert there could be a workaround, but still it
will be quite trivial.

Therefore, "fully excluding repairFragmentation=false" is not wise.

But... probably we could try to acquire the "cleanup lock" to perform
prune_freeze with "repairFragmentation=true", and fall back to
"repairFragmentation=false" if the acquisition fails.

Still, I repeat, completely excluding "repairFragmentation=false" doesn't
look feasible to me.
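
In code, the fallback mentioned above could look roughly like this (a
hypothetical sketch, not what the current patch set does; the helper name
is made up):

```
/* Hypothetical sketch: decide whether fragmentation may be repaired,
 * based on which lock we managed to get.  ConditionalLockBufferForCleanup
 * succeeds only if nobody else holds a pin on the page. */
static bool
lock_page_for_prune(Buffer buffer)
{
    if (ConditionalLockBufferForCleanup(buffer))
        return true;    /* no other pins: tuples may be moved */

    /* someone else has the page pinned: take a plain exclusive content
     * lock; tuples must stay at their current offsets */
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
    return false;
}
```

The returned flag would then be passed as repairFragmentation to
heap_page_prune_and_freeze().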

--
regards
Yura Sokolov aka funny-falcon

#197Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Yura Sokolov (#196)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Rebase @ f5a987c0e5f6bbf0
Main conflicts were:
62a17a9283 Integrate FullTransactionIds deeper into two-phase code
2633dae2e4 Standardize LSN formatting by zero padding

Also, the change to not set XLOG_HEAP_INIT_PAGE on an empty page, as discussed above, has been added.

On Mon, 7 Jul 2025 at 12:49, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

Wow, it is a really great bug.

It's funny that no one has actually encountered this problem in all these
years. It must be a very rare mix of circumstances, I think.

Thank you very much, everyone, for participating!

--
Best regards,
Maxim Orlov.

Attachments:

v65.zipapplication/zip; name=v65.zip
#198Maksim.Melnikov
Maksim.Melnikov
m.melnikov@postgrespro.ru
In reply to: Evgeny Voropaev (#195)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi all,

On 07.07.2025 11:17, Evgeny Voropaev wrote:

Do-side:
1. Having page ABC with several tuples.
2. Starting to perform insertion of new tuple
   2.1. In the case of an inappropriate xid_base, trying to fit base
   2.1.1 Freezing and pruning tuples without further repairing
fragmentation.
    2.1.2 All tuples have been pruned (no alive tuples on the page
since this moment)
3. Inserting a new tuple and setting XLOG_HEAP_INIT_PAGE, assuming
that the only tuple located at the bottom of the page (assuming that
fragmentation has been performed).

Result: We have the ABC page with the new tuple inserted somewhere in
the MIDDLE of the page and surrounded with garbage from dead and
unused tuples. At the same time we have an xlog record bringing the
XLOG_HEAP_INIT_PAGE bit.

Redo-side
1. Observing XLOG_HEAP_INIT_PAGE
2. Creating a new page and inserting the new tuple into the first
position of the page.

Evgeny, I've tried to reproduce the described scenario, but I can't get
the XLOG_HEAP_INIT_PAGE bit set.

Short steps description.

1. Create a test table, insert a tuple, then execute a few update operations,
then execute a delete operation; all tuples are placed on one page and
"select * from test_table;" returns 0 rows.
2. Stop postgres and use pg_resetwal to advance the xid. This will help us
to prune the tuples.
3. Start postgres and insert a new tuple on the page; pruning of the old
tuples occurs, and here we expect the XLOG_HEAP_INIT_PAGE bit to be set in
the xlog record. As I understand it, this is the cause of the subsequent
tuple location inconsistency.
But as far as I can see, the XLOG_HEAP_INIT_PAGE bit isn't set. The reason is
this condition:

        if (ItemPointerGetOffsetNumber(&(heaptup->t_self)) == FirstOffsetNumber &&
            PageGetMaxOffsetNumber(page) == FirstOffsetNumber)
        {
            info |= XLOG_HEAP_INIT_PAGE;
            bufflags |= REGBUF_WILL_INIT;
        }
As I can see, the new heaptup offset isn't equal to FirstOffsetNumber, and
that is enough to skip setting the bit.
I've used pg_waldump to check whether the bit is set:

rmgr: Heap        len (rec/tot):     75/    75, tx: 2147485000, lsn: 0/02001E90, prev 0/02001E50, desc: INSERT off: 4, flags: 0x00, blkref #0: rel 1663/5/16431 blk 0

So the offset is 4 and the desc is INSERT; if XLOG_HEAP_INIT_PAGE were set,
it would be INSERT+INIT.
In the attached file you can find a detailed test description with results, so
you can repeat it.

Maybe I missed something in your scenario; can you please clarify?
Or maybe it isn't reproducible at all.

Best regards,
Maksim Melnikov

Attachments:

tuple-location-inconsistency-scenario-desc.txttext/plain; charset=UTF-8; name=tuple-location-inconsistency-scenario-desc.txt
#199Daniil Davydov
Daniil Davydov
3danissimo@gmail.com
In reply to: Maksim.Melnikov (#198)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi,

On Thu, Jul 17, 2025 at 8:36 PM Maksim.Melnikov
<m.melnikov@postgrespro.ru> wrote:

On 07.07.2025 11:17, Evgeny Voropaev wrote:

Do-side:
1. Having page ABC with several tuples.
2. Starting to perform insertion of new tuple
2.1. In the case of an inappropriate xid_base, trying to fit base
2.1.1 Freezing and pruning tuples without further repairing
fragmentation.
2.1.2 All tuples have been pruned (no alive tuples on the page
since this moment)
3. Inserting a new tuple and setting XLOG_HEAP_INIT_PAGE, assuming
that the only tuple located at the bottom of the page (assuming that
fragmentation has been performed).

Result: We have the ABC page with the new tuple inserted somewhere in
the MIDDLE of the page and surrounded with garbage from dead and
unused tuples. At the same time we have an xlog record bringing the
XLOG_HEAP_INIT_PAGE bit.

Redo-side
1. Observing XLOG_HEAP_INIT_PAGE
2. Creating a new page and inserting the new tuple into the first
position of the page.

Evgeny, I've tried to reproduce described scenario, but I can't get
XLOG_HEAP_INIT_PAGE bit set.

In attached file you can find detailed test description with results, so
you can repeat it.

Maybe I missed something in your scenario, can you please clarify?
Or maybe it is't reproducable at all.

I have also tried to reproduce this bug. To do this, we need several
conditions:
1) The new tuple must be "linked" with the first line pointer on the page.
2) At the time the new tuple is inserted, there should be another
tuple (A) on the page.
3) Freeze & prune must be able to truncate the line pointer of
tuple (A).

We can easily reach conditions (1) and (2):
1. Insert two tuples;
2. Delete the first of them;
3. Call vacuum on the table;
4. Delete the second tuple;
5. Advance the xid into the future (for example, via pg_resetwal);
6. Insert a new tuple.

Thus, the new tuple will be associated with the first line pointer,
and the page will have one dead tuple (associated with the second
line pointer).

But as far as I understand, condition (3) is unreachable, because we
just cannot truncate the line pointer of the second tuple (or prune the
tuple itself).

Possibly, it could have happened inside freeze_single_heap_page, but
this function cannot mark the line pointer of the second tuple as LP_UNUSED
(we don't specify HEAP_PAGE_PRUNE_MARK_UNUSED_NOW flag for
heap_page_prune_and_freeze call).

As a result, the page will always have more than one line pointer.
Interference from parallel operations is excluded because we hold an
exclusive lock on the buffer during the heap_page_prepare_for_xid call.

Maybe there are operations that might mark line pointers as unused
without repairing fragmentation (that would be useful for reproducing
this bug), but I didn't find any.

BTW, I think that checking "phdr->pd_special == phdr->pd_upper" is the best
way to decide whether we need to set the XLOG_HEAP_INIT_PAGE flag in the
WAL record (at least it would eliminate the occurrence of the described
problem).
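
For illustration only, here is a sketch of how that check could be combined
with the existing test in heap_insert(); the page_was_empty flag would have to
be captured before the new tuple is placed on the page, and this is not the
patch's actual code:

    /*
     * Captured before the new tuple is put on the page: the data area is
     * completely empty only when pd_upper has not been pulled down below
     * pd_special, i.e. no tuple data (live or garbage left behind by pruning
     * without defragmentation) is stored on the page.  Only then will redo,
     * which re-initializes the page, place the tuple exactly where the
     * primary did.
     */
    bool    page_was_empty = (((PageHeader) page)->pd_upper ==
                              ((PageHeader) page)->pd_special);

    /* ... RelationPutHeapTuple() ... */

    if (page_was_empty &&
        ItemPointerGetOffsetNumber(&(heaptup->t_self)) == FirstOffsetNumber &&
        PageGetMaxOffsetNumber(page) == FirstOffsetNumber)
    {
        info |= XLOG_HEAP_INIT_PAGE;
        bufflags |= REGBUF_WILL_INIT;
    }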

--
Best regards,
Daniil Davydov

#200Yura Sokolov
Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Daniil Davydov (#199)
Re: Add 64-bit XIDs into PostgreSQL 15

On 21.07.2025 12:28, Daniil Davydov wrote:

Hi,

On Thu, Jul 17, 2025 at 8:36 PM Maksim.Melnikov
<m.melnikov@postgrespro.ru> wrote:

On 07.07.2025 11:17, Evgeny Voropaev wrote:

Do-side:
1. Having page ABC with several tuples.
2. Starting to perform insertion of new tuple
2.1. In the case of an inappropriate xid_base, trying to fit base
2.1.1 Freezing and pruning tuples without further repairing
fragmentation.
2.1.2 All tuples have been pruned (no alive tuples on the page
since this moment)
3. Inserting a new tuple and setting XLOG_HEAP_INIT_PAGE, assuming
that the only tuple located at the bottom of the page (assuming that
fragmentation has been performed).

Result: We have the ABC page with the new tuple inserted somewhere in
the MIDDLE of the page and surrounded with garbage from dead and
unused tuples. At the same time we have an xlog record bringing the
XLOG_HEAP_INIT_PAGE bit.

Redo-side
1. Observing XLOG_HEAP_INIT_PAGE
2. Creating a new page and inserting the new tuple into the first
position of the page.

Evgeny, I've tried to reproduce described scenario, but I can't get
XLOG_HEAP_INIT_PAGE bit set.

In attached file you can find detailed test description with results, so
you can repeat it.

Maybe I missed something in your scenario, can you please clarify?
Or maybe it is't reproducable at all.

I have also tried to reproduce this bug. To do this, we need several
conditions:
1) New tuple must be "linked" with the first line pointer on page

This is not quite right.
Empty space large enough for the new tuple may cause the page to be selected
for insertion.

Yet the page has to pass "pruning" on the read-access path. So either the
empty space needs to be even larger (so that pruning is not triggered), or
there should be a tuple of interest to a transaction that is alive at the
moment of the page read-access but has finished by the moment heap_insert
executes.

It looks like quite a complex set of conditions, so it should be extremely
rare in production. But it is still possible.

Moreover, it is hard for the bug to trigger real problems, because
INSERT/UPDATE/DELETE records describe the operation by tuple number (line
pointer), not by physical offset. So operations are replayed correctly...
unless there was a switch of master, and the replica did more insertions into
the same page, overflowing the free space available on the previous master,
with full-page writes disabled.
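
For reference, the heap insert WAL record indeed identifies its target by
line pointer number rather than by a byte position on the page; this is the
struct from access/heapam_xlog.h:

    typedef struct xl_heap_insert
    {
        OffsetNumber offnum;        /* inserted tuple's offset */
        uint8        flags;

        /* xl_heap_header & TUPLE DATA in backup block 0 */
    } xl_heap_insert;

So ordinary replay stays logically correct; the discrepancy surfaces only
where the physical layout matters, such as the wal_consistency_checking
comparison or an INIT-style record that rebuilds the page from scratch.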

So, the bug is real. But it looks quite minor.

(Disclaimer: I didn't re-analyze the code while writing the text above, so I
could be mistaken somewhere.)

--
regards
Yura Sokolov aka funny-falcon

#201Evgeny Voropaev
Evgeny Voropaev
evgeny.voropaev@tantorlabs.com
In reply to: Yura Sokolov (#200)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Hello hackers!

I still have not created a TAP test that narrows down the problem of page
inconsistency at a WAL-redo Heap/INSERT+INIT operation. But the following
procedure allows us to reproduce the problem during more common and
more general tests.

1. Do not apply a fix for this problem, or use the xid64 patch version v64
or earlier.

2. Apply the `Emulating page inconsistency at a WAL-redo-Heap/INSERT+INIT
operation` patch, which passes `repairFragmentation=false` to the invocation
of `heap_page_prune_and_freeze` from the `heap_page_prune_opt` function.
The code is quite contrived, but it allows the problem to be reproduced.

3. Start the 027_stream_regress.pl test with the following command:
make -C "./src/test/recovery/" check \
    PROVE_TESTS="t/027_stream_regress.pl" \
    PG_TEST_EXTRA="wal_consistency_checking"

As a result, you should see
```
2025-08-05 14:15:38.196 +08 startup[173535] FATAL: inconsistent page found, rel 1663/16384/1259, forknum 0, blkno 0
2025-08-05 14:15:38.196 +08 startup[173535] CONTEXT: WAL redo at 0/3490EF0 for Heap/INSERT+INIT: off: 0, flags: 0x00; blkref #0: rel 1663/16384/1259, blk 0 FPW
```

I am convinced that the sequence is as follows:
Step 1. Something (in the case of the presented contrived code, it is the
`heap_page_prune_opt` function) prunes the page with
repairFragmentation=false, kills all tuples, and does not repair the
fragmentation of the pruned page.
Step 2. An insertion is performed into this empty but not-repaired page.
Result: the problem of page inconsistency at a WAL-redo Heap/INSERT+INIT
operation.

In this scenario we do not even need to invoke `prepare_page_for_xid` or
`freeze_single_heap_page`.

Once again, this is quite contrived code, and I do not know whether the real
xid64 patch code can ever perform pruning with repairFragmentation=false and
subsequently emit Heap/INSERT+INIT. But maybe this contrived code will help
us to reproduce the problem and to create a specific TAP test.

Best regards,
Evgeny Voropaev.

Attachments:

0001-Emulating-page-inconsistency-at-a-WAL-redo-Heap-INSE.patchtext/x-patch; charset=UTF-8; name=0001-Emulating-page-inconsistency-at-a-WAL-redo-Heap-INSE.patch
#202Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#197)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Rebase @ 53eff471c69dc8b0c01

I revised parts of heap_convert.c's conversion from 32-bit to
64-bit pages.

Previously, we deferred the full-page write of converted pages to reduce the
impact on WAL, leaving some converted pages only in RAM. This, however, may
allow an incoming autovacuum (AV) to clean up transaction statuses that some
of those pages, depending on their XIDs, still need. As a result, we risk
losing transaction statuses and encountering problems like "could not open
file pg_xact/...".

So I decided to abandon the delayed FPW of converted pages, as this write is
purely a one-time cost incurred on the first read of the data after the
64-bit upgrade.
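
For illustration, a sketch of the immediate-write approach; heap_convert.c
and its conversion routine are part of the patch, so the exact call site here
is an assumption rather than the actual code:

    /*
     * Hypothetical placement inside the page-conversion path, with the
     * buffer pinned and exclusively locked: after the page has been
     * rewritten into the 64-bit format, dirty it and emit the full-page
     * image immediately instead of deferring it, so a converted page no
     * longer exists only in memory while autovacuum may truncate pg_xact
     * data that its XIDs still need.
     */
    START_CRIT_SECTION();
    MarkBufferDirty(buffer);
    log_newpage_buffer(buffer, true);   /* heap pages use the standard layout */
    END_CRIT_SECTION();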

On Tue, 15 Jul 2025 at 09:23, Maxim Orlov <orlovmg@gmail.com> wrote:

Rebase @ f5a987c0e5f6bbf0
Main conflicts were:
62a17a9283 Integrate FullTransactionIds deeper into two-phase code
2633dae2e4 Standardize LSN formatting by zero padding

Also, no setting XLOG_HEAP_INIT_PAGE on empty page as discussed above
added.

On Mon, 7 Jul 2025 at 12:49, Yura Sokolov <y.sokolov@postgrespro.ru>
wrote:

Wow, it is a really great bug.

It's funny that no one has actually encountered this problem in all these
years. It must
be a very rare mix of circumstances, I think.

Thank you very much, everyone, for participating!

--
Best regards,
Maxim Orlov.

--
Best regards,
Maxim Orlov.

Attachments:

v66.zipapplication/zip; name=v66.zip
#203Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: Maxim Orlov (#202)
1 attachment(s)
Re: Add 64-bit XIDs into PostgreSQL 15

Here is a rebase @ 4d3f623ea88

The patch set has grown increasingly old. It appears to me that
everything needs to be thoroughly overhauled to bring it up to date,
especially given that the first patch to be committed switches
multixact offsets to 64 bits and uses a different pg_upgrade method.

I am planning to tackle this in the near future and give the patch a
thorough overhaul.
--
Best regards,
Maxim Orlov.

Attachments:

v67.zipapplication/zip; name=v67.zip
#204wenhui qiu
wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Maxim Orlov (#203)
Re: Add 64-bit XIDs into PostgreSQL 15

Hi Maxim,
I saw that the "make mxidoff 64 bits" patch needs more reviewers, and
pg_resetwal needs to support 64-bit mxidoff. Do we need to invite more people
to review?

Thanks

On Tue, Nov 25, 2025 at 2:12 PM Maxim Orlov <orlovmg@gmail.com> wrote:

Here is a rebase @ 4d3f623ea88

The patch set has grown increasingly old. It appears to me that
everything requires to be thoroughly overhauled to bring it up to date.
Especially given that the first patch to be committed switches
multi-offsets to 64-bit and utilizes a different pg_upgrade method.

I am planning to tackle this in the near future and give the patch a
thorough overhaul.
--
Best regards,
Maxim Orlov.

#205Maxim Orlov
Maxim Orlov
orlovmg@gmail.com
In reply to: wenhui qiu (#204)
Re: Add 64-bit XIDs into PostgreSQL 15

On Tue, 25 Nov 2025 at 10:03, wenhui qiu <qiuwenhuifx@gmail.com> wrote:

hi Maxim
I saw (make mxidoff 64 bits) path need more guys review and
pg_resetwal need support mxidoff 64,do we need to invite more to review ?

Thank you for your interest in this topic. Actually, I was talking about
thread [0]https://commitfest.postgresql.org/patch/5205/. When it is committed
(I hope!), this thread will have to be reworked accordingly.

[0]: https://commitfest.postgresql.org/patch/5205/

--
Best regards,
Maxim Orlov.