pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

Started by Peter Eisentrautover 3 years ago7 messages
#1Peter Eisentraut
peter@eisentraut.org

Change timeline field of IDENTIFY_SYSTEM to int8

It was int4, but in the other replication commands, timelines are
returned as int8.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: /messages/by-id/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/ec40f3422412cfdc140b5d3f67db7fd2dac0f1e2

Modified Files
--------------
doc/src/sgml/protocol.sgml | 2 +-
src/backend/replication/walsender.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

Peter Eisentraut <peter@eisentraut.org> writes:

Change timeline field of IDENTIFY_SYSTEM to int8

Surely this patch is far from complete?

To start with, just a few lines down in IdentifySystem() the column
is filled using Int32GetDatum not Int64GetDatum. I will get some
popcorn and await the opinions of the 32-bit buildfarm animals.

But what about whatever code is reading the output? And what if
that code isn't v16? I can't believe that we can make a wire
protocol change as summarily as this.

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

I wrote:

To start with, just a few lines down in IdentifySystem() the column
is filled using Int32GetDatum not Int64GetDatum. I will get some
popcorn and await the opinions of the 32-bit buildfarm animals.

Didn't need to wait long:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&amp;dt=2022-07-04%2005%3A39%3A50

The other questions still stand.

regards, tom lane

#4Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#2)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

On 04.07.22 07:55, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

Change timeline field of IDENTIFY_SYSTEM to int8

Surely this patch is far from complete?

To start with, just a few lines down in IdentifySystem() the column
is filled using Int32GetDatum not Int64GetDatum. I will get some
popcorn and await the opinions of the 32-bit buildfarm animals.

But what about whatever code is reading the output? And what if
that code isn't v16? I can't believe that we can make a wire
protocol change as summarily as this.

I think a client will either just read the string value and convert it
to some numeric type without checking what type was actually sent, or if
the client API is type-aware and automatically converts to a native type
of some sort, then it will probably already support 64-bit ints. Do you
see some problem scenario?

I'm seeing a bigger problem now, which is that our client code doesn't
parse bigger-than-int32 timeline IDs correctly.

libpqwalreceiver uses pg_strtoint32(), which will error on overflow.

pg_basebackup uses atoi(), so it will just truncate the value, except
for READ_REPLICATION_SLOT, where it uses atol(), so it will do the wrong
thing on Windows only.

There is clearly very little use for such near-overflow timeline IDs in
practice. But it still seems pretty inconsistent.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 04.07.22 07:55, Tom Lane wrote:

But what about whatever code is reading the output? And what if
that code isn't v16? I can't believe that we can make a wire
protocol change as summarily as this.

I think a client will either just read the string value and convert it
to some numeric type without checking what type was actually sent, or if
the client API is type-aware and automatically converts to a native type
of some sort, then it will probably already support 64-bit ints. Do you
see some problem scenario?

If the result of IDENTIFY_SYSTEM is always sent in text format, then
I agree that this isn't very problematic. If there are any clients
that fetch it in binary mode, though, this is absolutely a wire
protocol break for them ... and no, I don't believe an unsupported
claim that they'd adapt automatically.

I'm seeing a bigger problem now, which is that our client code doesn't
parse bigger-than-int32 timeline IDs correctly.

Yup.

regards, tom lane

#6Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#5)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

On 04.07.22 19:32, Tom Lane wrote:

If the result of IDENTIFY_SYSTEM is always sent in text format, then
I agree that this isn't very problematic. If there are any clients
that fetch it in binary mode, though, this is absolutely a wire
protocol break for them

The result rows of the replication commands are always sent in text format.

#7Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#2)
Re: pgsql: Change timeline field of IDENTIFY_SYSTEM to int8

On Mon, Jul 04, 2022 at 01:55:13AM -0400, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

Change timeline field of IDENTIFY_SYSTEM to int8

Surely this patch is far from complete?

Yeah..

But what about whatever code is reading the output? And what if
that code isn't v16? I can't believe that we can make a wire
protocol change as summarily as this.

Assuming that one reaches a timeline of 2 billion, this change would
make the TLI consumption of the client safe to signedness. But why is
it safe to do a protocol change when running IDENTIFY_SYSTEM? We've
been very strict to maintain compatibility for any protocol change,
hence why should the replication protocol be treated differently?
--
Michael