[bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

Started by Craig Ringer, over 9 years ago. 9 messages. List: hackers
#1 Craig Ringer
craig@2ndquadrant.com

Hi all

Today I ran into an issue where commit timestamp lookups were failing with

ERROR: cannot retrieve commit timestamp for transaction 2

which is of course FrozenTransactionId.

TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
which I think is wrong. Attached is a patch to make it return 0 for
FrozenTransactionId and BootstrapTransactionId, like it does for xids
that are too old.

Note that the prior behaviour was as designed and has tests to enforce
it. I just think it's wrong, and it's also not documented.

IMO this should be back-patched to 9.6 and, without the TAP test part, to 9.5.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Treat-frozen-and-bootstrap-xids-as-old-not-invalid-f.patch (text/x-patch; charset=US-ASCII; +23 -15)
#2 Craig Ringer
craig@2ndquadrant.com
In reply to: Craig Ringer (#1)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

On 23 November 2016 at 20:58, Craig Ringer <craig@2ndquadrant.com> wrote:

Hi all

Today I ran into an issue where commit timestamp lookups were failing with

ERROR: cannot retrieve commit timestamp for transaction 2

which is of course FrozenTransactionId.

TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
which I think is wrong. Attached is a patch to make it return 0 for
FrozenTransactionId and BootstrapTransactionId, like it does for xids
that are too old.

Note that the prior behaviour was as designed and has tests to enforce
it. I just think it's wrong, and it's also not documented.

IMO this should be back-patched to 9.6 and, without the TAP test part, to 9.5.

Updated to correct the other expected file, since there's an alternate.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Treat-frozen-and-bootstrap-xids-as-old-not-invalid-f.patch (text/x-patch; charset=US-ASCII; +23 -15)
#3 Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#1)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

Hi,

On 2016-11-23 20:58:22 +0800, Craig Ringer wrote:

Today I ran into an issue where commit timestamp lookups were failing with

ERROR: cannot retrieve commit timestamp for transaction 2

which is of course FrozenTransactionId.

TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
which I think is wrong. Attached is a patch to make it return 0 for
FrozenTransactionId and BootstrapTransactionId, like it does for xids
that are too old.

Why? It seems quite correct to not allow lookups for special case
values, as it seems sensible to give them special treatment at the call
site?

IMO this should be back-patched to 9.6 and, without the TAP test part,
to 9.5.

Why would we want to backpatch a behaviour change, when arguments for
both the current and the proposed behaviour exist?

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4 Craig Ringer
craig@2ndquadrant.com
In reply to: Andres Freund (#3)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

On 24 November 2016 at 02:32, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2016-11-23 20:58:22 +0800, Craig Ringer wrote:

Today I ran into an issue where commit timestamp lookups were failing with

ERROR: cannot retrieve commit timestamp for transaction 2

which is of course FrozenTransactionId.

TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
which I think is wrong. Attached is a patch to make it return 0 for
FrozenTransactionId and BootstrapTransactionId, like it does for xids
that are too old.

Why? It seems quite correct to not allow lookups for special case
values, as it seems sensible to give them special treatment at the call
site?

It's surprising behaviour that doesn't make sense. Look at it this way:

- We do some work, generating rows that have commit timestamps
- TransactionIdGetCommitTsData() on those rows returns their cts fine
- The commit timestamp data ages out
- TransactionIdGetCommitTsData() returns 0 on these rows
- vacuum comes along and freezes the rows, even though nothing's changed
- TransactionIdGetCommitTsData() suddenly ERRORs

Nothing has meaningfully changed on these rows. They have gone from
"old, committed, past the commit timestamp threshold" to "old,
committed, past the commit timestamp threshold, frozen".

It makes no sense to ERROR when vacuum gets around to freezing the
tuples, when we don't also ERROR when we pass the cts threshold.
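
The sequence above can be sketched as a small, self-contained C model of
the pre-patch behaviour. This is not the PostgreSQL source: CtsResult,
model_lookup_prepatch and oldest_cts_xid are hypothetical names used only
for illustration, the real function raises an ERROR rather than returning
a code, and xid wraparound is ignored. Only the special xid values and
TransactionIdIsNormal() mirror PostgreSQL's own definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Special xid values, matching PostgreSQL's definitions. */
#define InvalidTransactionId       ((TransactionId) 0)
#define BootstrapTransactionId     ((TransactionId) 1)
#define FrozenTransactionId        ((TransactionId) 2)
#define FirstNormalTransactionId   ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/* Hypothetical outcome codes for this model only. */
typedef enum
{
    CTS_FOUND,      /* timestamp is available */
    CTS_TOO_OLD,    /* real function returns false with timestamp 0 */
    CTS_ERROR       /* real function raises "cannot retrieve commit timestamp" */
} CtsResult;

/*
 * Toy model of the pre-patch lookup. oldest_cts_xid is an assumed stand-in
 * for the oldest xid whose commit timestamp is still retained.
 */
CtsResult
model_lookup_prepatch(TransactionId xid, TransactionId oldest_cts_xid)
{
    assert(oldest_cts_xid >= FirstNormalTransactionId);

    if (!TransactionIdIsNormal(xid))
        return CTS_ERROR;       /* frozen, bootstrap and invalid xids land here */
    if (xid < oldest_cts_xid)
        return CTS_TOO_OLD;     /* aged out: quietly reports "no data" */
    return CTS_FOUND;
}
```

In this model a row's xmin moves from CTS_FOUND to CTS_TOO_OLD as the data
ages out, and then to CTS_ERROR once vacuum replaces it with
FrozenTransactionId, which is the jump being objected to.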

ERRORing on BootstrapTransactionId is slightly more reasonable since
those rows can never have had a cts in the first place, but it's also
unnecessary since they're effectively "oldest always-committed xids".

Making it ERROR on FrozenTransactionId was a mistake and should be corrected.

IMO this should be back-patched to 9.6 and, without the TAP test part,
to 9.5.

Why would we want to backpatch a behaviour change, when arguments for
both the current and the proposed behaviour exist?

I don't think it's crucial since callers can just work around it, but
IMO the current behaviour is a design oversight that should be
corrected and can be safely and sensibly corrected. Nobody's going to
rely on FrozenTransactionId ERRORing.

I don't think a backpatch is crucial though; as you note, C-level
callers can work around the problem pretty simply, and that's just
what I've done in pglogical for existing versions. I just think it's
ugly, should be fixed, and is safe to fix.
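
A caller-side guard of the kind described might look roughly like the
sketch below. It is an illustration under stated assumptions, not the
pglogical code: lookup_commit_ts_safe, CommitTsLookupFn and demo_lookup
are hypothetical names, and the lookup signature is simplified (it omits
the origin-node output argument) so that the example compiles without
PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef int64_t TimestampTz;

/* Special xid values, matching PostgreSQL's definitions. */
#define BootstrapTransactionId ((TransactionId) 1)
#define FrozenTransactionId    ((TransactionId) 2)

/* Simplified, hypothetical lookup signature used by this sketch. */
typedef bool (*CommitTsLookupFn) (TransactionId xid, TimestampTz *ts);

/*
 * Answer "no timestamp available" for the frozen and bootstrap xids without
 * calling the lookup, so its ERROR path is never reached for them.
 */
bool
lookup_commit_ts_safe(TransactionId xid, TimestampTz *ts,
                      CommitTsLookupFn lookup)
{
    assert(ts != NULL && lookup != NULL);

    if (xid == BootstrapTransactionId || xid == FrozenTransactionId)
    {
        *ts = 0;
        return false;
    }
    return lookup(xid, ts);
}

/* Stand-in lookup for demonstration only: invents a timestamp of 1000 + xid. */
bool
demo_lookup(TransactionId xid, TimestampTz *ts)
{
    *ts = (TimestampTz) 1000 + (TimestampTz) xid;
    return true;
}
```

Every call site needs the same guard, which is the burden the proposed
fix would move into the lookup function itself.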

It's slightly harder for SQL-level callers to work around since they
must hardcode a CASE that tests for xmin = XID '1' OR xmin = XID '2',
and it's much less reasonable to expect SQL level callers to deal with
this sort of mess with low level state.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#5 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Craig Ringer (#4)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

I considered the argument here for a bit and I think Craig is right --
FrozenXid eventually makes it to a tuple's xmin where it becomes a burden
to the caller, making our interface bug-prone -- sure you can
special-case it, but you don't until it first happens ... and it may not
until you're deep into production.

Even the code comment is confused: "error if the given Xid doesn't
normally commit". But surely FrozenXid *does* commit in the sense that
it appears in committed tuples' Xmin.

We already have a good mechanism for replying to the query with "this
value is too old for us to have its commit TS", which is a false return
value. We should use that.
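
As a rough sketch of that approach, the model below folds the frozen and
bootstrap xids into the existing "too old" answer. It is a simplified
illustration, not the committed patch: model_lookup_patched,
oldest_cts_xid and would_error are hypothetical names, the placeholder
timestamp is invented, xid wraparound is ignored, and the assumption that
InvalidTransactionId keeps raising an ERROR is mine rather than something
stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef int64_t TimestampTz;

/* Special xid values, matching PostgreSQL's definitions. */
#define InvalidTransactionId       ((TransactionId) 0)
#define BootstrapTransactionId     ((TransactionId) 1)
#define FrozenTransactionId        ((TransactionId) 2)
#define FirstNormalTransactionId   ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/*
 * Toy model of the proposed behaviour. Returns true and sets *ts when a
 * timestamp is available; returns false with *ts = 0 otherwise.
 * *would_error marks the case where the real function would raise ERROR.
 */
bool
model_lookup_patched(TransactionId xid, TransactionId oldest_cts_xid,
                     TimestampTz *ts, bool *would_error)
{
    assert(ts != NULL && would_error != NULL);

    *ts = 0;
    *would_error = false;

    if (xid == InvalidTransactionId)
    {
        *would_error = true;    /* assumption: still treated as a caller bug */
        return false;
    }
    if (!TransactionIdIsNormal(xid))
        return false;           /* frozen/bootstrap: same answer as "too old" */
    if (xid < oldest_cts_xid)
        return false;           /* aged out */

    *ts = (TimestampTz) 1000 + (TimestampTz) xid;   /* placeholder value */
    return true;
}
```

With this shape a caller sees the same false return before and after
vacuum freezes a tuple, so no special case is needed at the call site.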

I think not backpatching is worse, because then users have to be aware
that they need to handle the FrozenXid case specially, but only on
9.5/9.6 ... I think the reason it took this long to pop up is because
it has taken this long to get to replication systems on which this issue
matters.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#5)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I considered the argument here for a bit and I think Craig is right --

FWIW, I agree. We shouldn't require every call site to special-case this,
and we definitely don't want it to require special cases in SQL code.

(And I'm for back-patching, too.)

regards, tom lane


#7 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I considered the argument here for a bit and I think Craig is right --

FWIW, I agree. We shouldn't require every call site to special-case this,
and we definitely don't want it to require special cases in SQL code.

(And I'm for back-patching, too.)

Pushed.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#8 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Craig Ringer (#2)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

Craig Ringer wrote:

Updated to correct the other expected file, since there's an alternate.

FWIW I don't know what you did here, but you did not patch the
alternate expected file.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#9 Craig Ringer
craig@2ndquadrant.com
In reply to: Alvaro Herrera (#8)
Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

On 25 November 2016 at 02:44, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Craig Ringer wrote:

Updated to correct the other expected file, since there's an alternate.

FWIW I don't know what you did here, but you did not patch the
alternate expected file.

Damn. What I did was attach the first patch a second time.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
