Binary support for pgoutput plugin

Started by Dave Crameralmost 7 years ago93 messageshackers
Jump to latest
#1Dave Cramer
pg@fastcrypt.com

Is there a reason why pgoutput sends data in text format? Seems to me that
sending data in binary would provide a considerable performance improvement.

Dave Cramer

#2David Fetter
david@fetter.org
In reply to: Dave Cramer (#1)
Re: Binary support for pgoutput plugin

On Mon, Jun 03, 2019 at 10:49:54AM -0400, Dave Cramer wrote:

Is there a reason why pgoutput sends data in text format? Seems to
me that sending data in binary would provide a considerable
performance improvement.

Are you seeing something that suggests that the text output is taking
a lot of time or other resources?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#3Dave Cramer
pg@fastcrypt.com
In reply to: David Fetter (#2)
Re: Binary support for pgoutput plugin

Dave Cramer

On Mon, 3 Jun 2019 at 20:54, David Fetter <david@fetter.org> wrote:

On Mon, Jun 03, 2019 at 10:49:54AM -0400, Dave Cramer wrote:

Is there a reason why pgoutput sends data in text format? Seems to
me that sending data in binary would provide a considerable
performance improvement.

Are you seeing something that suggests that the text output is taking
a lot of time or other resources?

Actually it's on the other end that there is improvement. Parsing text

takes much longer for almost everything except ironically text.

To be more transparent there is some desire to use pgoutput for something
other than logical replication. Change Data Capture clients such as
Debezium have a requirement for a stable plugin which is shipped with core
as this is always available in cloud providers offerings. There's no reason
that I am aware of that they cannot use pgoutput for this. There's also no
reason that I am aware that binary outputs can't be supported. The protocol
would have to change slightly and I am working on a POC patch.

Thing is they aren't all written in C so using binary does provide a pretty
substantial win on the decoding end.

Dave

#4Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#3)
Re: Binary support for pgoutput plugin

Hi,

On 2019-06-04 15:47:04 -0400, Dave Cramer wrote:

On Mon, 3 Jun 2019 at 20:54, David Fetter <david@fetter.org> wrote:

On Mon, Jun 03, 2019 at 10:49:54AM -0400, Dave Cramer wrote:

Is there a reason why pgoutput sends data in text format? Seems to
me that sending data in binary would provide a considerable
performance improvement.

Are you seeing something that suggests that the text output is taking
a lot of time or other resources?

Actually it's on the other end that there is improvement. Parsing text

takes much longer for almost everything except ironically text.

It's on both sides, I'd say. E.g. float (until v12), timestamp, bytea
are all much more expensive to convert from binary to text.

To be more transparent there is some desire to use pgoutput for something
other than logical replication. Change Data Capture clients such as
Debezium have a requirement for a stable plugin which is shipped with core
as this is always available in cloud providers offerings. There's no reason
that I am aware of that they cannot use pgoutput for this.

Except that that's not pgoutput's purpose, and we shouldn't make it
meaningfully more complicated or slower to achieve this. Don't think
there's a conflict in this case though.

There's also no reason that I am aware that binary outputs can't be
supported.

Well, it *does* increase version dependencies, and does make replication
more complicated, because type oids etc cannot be relied to be the same
on source and target side.

The protocol would have to change slightly and I am working
on a POC patch.

Hm, what would have to be changed protocol wise? IIRC that'd just be a
different datum type? Or is that what you mean?
pq_sendbyte(out, 't'); /* 'text' data follows */

IIRC there was code for the binary protocol in a predecessor of
pgoutput.

I think if we were to add binary output - and I think we should - we
ought to only accept a patch if it's also used in core.

Greetings,

Andres Freund

#5Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#1)
Re: Binary support for pgoutput plugin

Dave Cramer

On Tue, 4 Jun 2019 at 16:30, Andres Freund <andres.freund@enterprisedb.com>
wrote:

Hi,

On 2019-06-04 15:47:04 -0400, Dave Cramer wrote:

On Mon, 3 Jun 2019 at 20:54, David Fetter <david@fetter.org> wrote:

On Mon, Jun 03, 2019 at 10:49:54AM -0400, Dave Cramer wrote:

Is there a reason why pgoutput sends data in text format? Seems to
me that sending data in binary would provide a considerable
performance improvement.

Are you seeing something that suggests that the text output is taking
a lot of time or other resources?

Actually it's on the other end that there is improvement. Parsing text

takes much longer for almost everything except ironically text.

It's on both sides, I'd say. E.g. float (until v12), timestamp, bytea
are all much more expensive to convert from binary to text.

To be more transparent there is some desire to use pgoutput for something
other than logical replication. Change Data Capture clients such as
Debezium have a requirement for a stable plugin which is shipped with

core

as this is always available in cloud providers offerings. There's no

reason

that I am aware of that they cannot use pgoutput for this.

Except that that's not pgoutput's purpose, and we shouldn't make it
meaningfully more complicated or slower to achieve this. Don't think
there's a conflict in this case though.

agreed, my intent was to slightly bend it to my will :)

There's also no reason that I am aware that binary outputs can't be
supported.

Well, it *does* increase version dependencies, and does make replication
more complicated, because type oids etc cannot be relied to be the same
on source and target side.

I was about to agree with this but if the type oids change from source to

target you
still can't decode the text version properly. Unless I mis-understand
something here ?

The protocol would have to change slightly and I am working
on a POC patch.

Hm, what would have to be changed protocol wise? IIRC that'd just be a
different datum type? Or is that what you mean?
pq_sendbyte(out, 't'); /* 'text' data follows */

I haven't really thought this through completely but one place JDBC has

problems with binary is with
timestamps with timezone as we don't know which timezone to use. Is it safe
to assume everything is in UTC
since the server stores in UTC ? Then there are UDF's. My original thought
was to use options to send in the
types that I wanted in binary, everything else could be sent as text.

IIRC there was code for the binary protocol in a predecessor of

pgoutput.

Hmmm that might be good place to start. I will do some digging through git
history

I think if we were to add binary output - and I think we should - we
ought to only accept a patch if it's also used in core.

Certainly; as not doing so would make my work completely irrelevant for my
purpose.

Thanks,

Dave

#6Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#5)
Re: Binary support for pgoutput plugin

Hi,

On 2019-06-04 16:39:32 -0400, Dave Cramer wrote:

On Tue, 4 Jun 2019 at 16:30, Andres Freund <andres.freund@enterprisedb.com>
wrote:

There's also no reason that I am aware that binary outputs can't be
supported.

Well, it *does* increase version dependencies, and does make replication
more complicated, because type oids etc cannot be relied to be the same
on source and target side.

I was about to agree with this but if the type oids change from source
to target you still can't decode the text version properly. Unless I
mis-understand something here ?

The text format doesn't care about oids. I don't see how it'd be a
problem? Note that some people *intentionally* use different types from
source to target system when logically replicating. So you can't rely on
the target table's types under any circumstance.

I think you really have to use the textual type which we already write
out (cf logicalrep_write_typ()) to call the binary input functions. And
you can send only data as binary that's from builtin types - otherwise
there's no guarantee at all that the target system has something
compatible. And even if you just assumed that all extensions etc are
present, you can't transport arrays / composite types in binary: For
hard to discern reasons we a) embed type oids in them b) verify them. b)
won't ever work for non-builtin types, because oids are assigned
dynamically.

I think if we were to add binary output - and I think we should - we
ought to only accept a patch if it's also used in core.

Certainly; as not doing so would make my work completely irrelevant for my
purpose.

What I mean is that the builtin logical replication would have to use
this on the receiving side too.

Greetings,

Andres Freund

#7Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#6)
Re: Binary support for pgoutput plugin

On Tue, 4 Jun 2019 at 16:46, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-06-04 16:39:32 -0400, Dave Cramer wrote:

On Tue, 4 Jun 2019 at 16:30, Andres Freund <

andres.freund@enterprisedb.com>

wrote:

There's also no reason that I am aware that binary outputs can't be
supported.

Well, it *does* increase version dependencies, and does make

replication

more complicated, because type oids etc cannot be relied to be the same
on source and target side.

I was about to agree with this but if the type oids change from source
to target you still can't decode the text version properly. Unless I
mis-understand something here ?

The text format doesn't care about oids. I don't see how it'd be a
problem? Note that some people *intentionally* use different types from
source to target system when logically replicating. So you can't rely on
the target table's types under any circumstance.

I think you really have to use the textual type which we already write
out (cf logicalrep_write_typ()) to call the binary input functions. And
you can send only data as binary that's from builtin types - otherwise
there's no guarantee at all that the target system has something
compatible. And even if you just assumed that all extensions etc are
present, you can't transport arrays / composite types in binary: For
hard to discern reasons we a) embed type oids in them b) verify them. b)
won't ever work for non-builtin types, because oids are assigned
dynamically.

I figured arrays and UDT's would be problematic.

I think if we were to add binary output - and I think we should - we
ought to only accept a patch if it's also used in core.

Certainly; as not doing so would make my work completely irrelevant for

my

purpose.

What I mean is that the builtin logical replication would have to use
this on the receiving side too.

Got it, thanks for validating that the idea isn't nuts. Now I *have* to

produce a POC.

Thanks,
Dave

Show quoted text
#8Chapman Flack
chap@anastigmatix.net
In reply to: Dave Cramer (#5)
Re: Binary support for pgoutput plugin

On 6/4/19 4:39 PM, Dave Cramer wrote:

I haven't really thought this through completely but one place JDBC has
problems with binary is with
timestamps with timezone as we don't know which timezone to use. Is it safe
to assume everything is in UTC
since the server stores in UTC ?

PL/Java, when converting to the Java 8 java.time types (because those
are sane), will turn a timestamp with timezone into an OffsetDateTime
with explicit offset zero (UTC), no matter what timezone may have been
used when the value was input (as you've observed, there's no way to
recover that). In the return direction, if given an OffsetDateTime
with any nonzero offset, it will adjust the value to UTC for postgres.

So, yes, say I.

Regards,
-Chap

#9David Fetter
david@fetter.org
In reply to: Dave Cramer (#7)
Re: Binary support for pgoutput plugin

On Tue, Jun 04, 2019 at 04:55:33PM -0400, Dave Cramer wrote:

On Tue, 4 Jun 2019 at 16:46, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-06-04 16:39:32 -0400, Dave Cramer wrote:

On Tue, 4 Jun 2019 at 16:30, Andres Freund <

andres.freund@enterprisedb.com>

wrote:

There's also no reason that I am aware that binary outputs can't be
supported.

Well, it *does* increase version dependencies, and does make

replication

more complicated, because type oids etc cannot be relied to be the same
on source and target side.

I was about to agree with this but if the type oids change from source
to target you still can't decode the text version properly. Unless I
mis-understand something here ?

The text format doesn't care about oids. I don't see how it'd be a
problem? Note that some people *intentionally* use different types from
source to target system when logically replicating. So you can't rely on
the target table's types under any circumstance.

I think you really have to use the textual type which we already write
out (cf logicalrep_write_typ()) to call the binary input functions. And
you can send only data as binary that's from builtin types - otherwise
there's no guarantee at all that the target system has something
compatible. And even if you just assumed that all extensions etc are
present, you can't transport arrays / composite types in binary: For
hard to discern reasons we a) embed type oids in them b) verify them. b)
won't ever work for non-builtin types, because oids are assigned
dynamically.

I figured arrays and UDT's would be problematic.

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#10Andres Freund
andres@anarazel.de
In reply to: David Fetter (#9)
Re: Binary support for pgoutput plugin

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

I think you recall wrongly. It's obviously possible that we have bugs
around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to do
endianess conversions, but that doesn't make it non-standardized.

- Andres

#11Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#10)
Re: Binary support for pgoutput plugin

On Tue, 4 Jun 2019 at 18:08, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

I think you recall wrongly. It's obviously possible that we have bugs
around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to do
endianess conversions, but that doesn't make it non-standardized.

Additionally there are a number of drivers that already know how to handle
our binary types.
I don't really think there's a win here. I also want to keep the changes
small .

Dave

#12Petr Jelinek
petr@2ndquadrant.com
In reply to: Andres Freund (#10)
Re: Binary support for pgoutput plugin

Hi,

On 05/06/2019 00:08, Andres Freund wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

I think you recall wrongly. It's obviously possible that we have bugs
around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to do
endianess conversions, but that doesn't make it non-standardized.

Yeah, there are really 3 formats of data we have, text protocol, binary
network protocol and internal on disk format. The internal on disk
format will not work across big/little-endian but network binary
protocol will.

FWIW I don't think the code for binary format was included in original
logical replication patch (I really tried to keep it as minimal as
possible), but the code and protocol is pretty much ready for adding that.

That said, pglogical has code which handles this (I guess Andres means
that by predecessor of pgoutput) so if you look for example at the
write_tuple/read_tuple/decide_datum_transfer in
https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_proto_native.c
that can help you give some ideas on how to approach this.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions
https://www.2ndQuadrant.com/

#13Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#12)
Re: Binary support for pgoutput plugin

Hi,

On Wed, 5 Jun 2019 at 07:18, Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:

Hi,

On 05/06/2019 00:08, Andres Freund wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

I think you recall wrongly. It's obviously possible that we have bugs
around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to do
endianess conversions, but that doesn't make it non-standardized.

Yeah, there are really 3 formats of data we have, text protocol, binary
network protocol and internal on disk format. The internal on disk
format will not work across big/little-endian but network binary
protocol will.

FWIW I don't think the code for binary format was included in original
logical replication patch (I really tried to keep it as minimal as
possible), but the code and protocol is pretty much ready for adding that.

Yes, I looked through the public history and could not find it. Thanks for
confirming.

That said, pglogical has code which handles this (I guess Andres means
that by predecessor of pgoutput) so if you look for example at the
write_tuple/read_tuple/decide_datum_transfer in

https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_proto_native.c
that can help you give some ideas on how to approach this.

Thanks for the tip!

Dave Cramer

Show quoted text
#14Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#13)
Re: Binary support for pgoutput plugin

On Wed, 5 Jun 2019 at 07:21, Dave Cramer <davecramer@gmail.com> wrote:

Hi,

On Wed, 5 Jun 2019 at 07:18, Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:

Hi,

On 05/06/2019 00:08, Andres Freund wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format is
not standardized across, for example, big- and little-endian machines.

I think you recall wrongly. It's obviously possible that we have bugs
around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to do
endianess conversions, but that doesn't make it non-standardized.

Yeah, there are really 3 formats of data we have, text protocol, binary
network protocol and internal on disk format. The internal on disk
format will not work across big/little-endian but network binary
protocol will.

FWIW I don't think the code for binary format was included in original
logical replication patch (I really tried to keep it as minimal as
possible), but the code and protocol is pretty much ready for adding that.

Yes, I looked through the public history and could not find it. Thanks for
confirming.

That said, pglogical has code which handles this (I guess Andres means
that by predecessor of pgoutput) so if you look for example at the
write_tuple/read_tuple/decide_datum_transfer in

https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_proto_native.c
that can help you give some ideas on how to approach this.

Thanks for the tip!

Looking at:
https://github.com/postgres/postgres/blob/8255c7a5eeba8f1a38b7a431c04909bde4f5e67d/src/backend/replication/pgoutput/pgoutput.c#L163

this seems completely ignored. What was the intent?

Dave

#15Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#14)
Re: Binary support for pgoutput plugin

Hi

On June 5, 2019 8:51:10 AM PDT, Dave Cramer <davecramer@gmail.com> wrote:

On Wed, 5 Jun 2019 at 07:21, Dave Cramer <davecramer@gmail.com> wrote:

Hi,

On Wed, 5 Jun 2019 at 07:18, Petr Jelinek

<petr.jelinek@2ndquadrant.com>

wrote:

Hi,

On 05/06/2019 00:08, Andres Freund wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format

is

not standardized across, for example, big- and little-endian

machines.

I think you recall wrongly. It's obviously possible that we have

bugs

around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to

do

endianess conversions, but that doesn't make it non-standardized.

Yeah, there are really 3 formats of data we have, text protocol,

binary

network protocol and internal on disk format. The internal on disk
format will not work across big/little-endian but network binary
protocol will.

FWIW I don't think the code for binary format was included in

original

logical replication patch (I really tried to keep it as minimal as
possible), but the code and protocol is pretty much ready for adding

that.

Yes, I looked through the public history and could not find it.

Thanks for

confirming.

That said, pglogical has code which handles this (I guess Andres

means

that by predecessor of pgoutput) so if you look for example at the
write_tuple/read_tuple/decide_datum_transfer in

https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_proto_native.c

that can help you give some ideas on how to approach this.

Thanks for the tip!

Looking at:
https://github.com/postgres/postgres/blob/8255c7a5eeba8f1a38b7a431c04909bde4f5e67d/src/backend/replication/pgoutput/pgoutput.c#L163

this seems completely ignored. What was the intent?

That's about the output of the plugin, not the datatypes. And independent of text/binary output, the protocol contains non-printable chars.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#16Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#15)
Re: Binary support for pgoutput plugin

Hi,

On Wed, 5 Jun 2019 at 12:01, Andres Freund <andres@anarazel.de> wrote:

Hi

On June 5, 2019 8:51:10 AM PDT, Dave Cramer <davecramer@gmail.com> wrote:

On Wed, 5 Jun 2019 at 07:21, Dave Cramer <davecramer@gmail.com> wrote:

Hi,

On Wed, 5 Jun 2019 at 07:18, Petr Jelinek

<petr.jelinek@2ndquadrant.com>

wrote:

Hi,

On 05/06/2019 00:08, Andres Freund wrote:

Hi,

On 2019-06-05 00:05:02 +0200, David Fetter wrote:

Would it make sense to work toward a binary format that's not
architecture-specific? I recall from COPY that our binary format

is

not standardized across, for example, big- and little-endian

machines.

I think you recall wrongly. It's obviously possible that we have

bugs

around this, but output/input routines are supposed to handle a
endianess independent format. That usually means that you have to

do

endianess conversions, but that doesn't make it non-standardized.

Yeah, there are really 3 formats of data we have, text protocol,

binary

network protocol and internal on disk format. The internal on disk
format will not work across big/little-endian but network binary
protocol will.

FWIW I don't think the code for binary format was included in

original

logical replication patch (I really tried to keep it as minimal as
possible), but the code and protocol is pretty much ready for adding

that.

Yes, I looked through the public history and could not find it.

Thanks for

confirming.

That said, pglogical has code which handles this (I guess Andres

means

that by predecessor of pgoutput) so if you look for example at the
write_tuple/read_tuple/decide_datum_transfer in

https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_proto_native.c

that can help you give some ideas on how to approach this.

Thanks for the tip!

Looking at:

https://github.com/postgres/postgres/blob/8255c7a5eeba8f1a38b7a431c04909bde4f5e67d/src/backend/replication/pgoutput/pgoutput.c#L163

this seems completely ignored. What was the intent?

That's about the output of the plugin, not the datatypes. And independent
of text/binary output, the protocol contains non-printable chars.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

So one of the things they would like added is to get not null information
in the schema record. This is so they can mark the field Optional in Java.
I presume this would also have some uses in other languages. As I
understand it this would require a protocol bump. If this were to be
accepted are there any outstanding asks that would useful to add if we were
going to bump the protocol?

Dave

#17Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#16)
Re: Binary support for pgoutput plugin

Hi,

On 2019-06-05 18:47:57 -0400, Dave Cramer wrote:

So one of the things they would like added is to get not null information
in the schema record. This is so they can mark the field Optional in Java.
I presume this would also have some uses in other languages. As I
understand it this would require a protocol bump. If this were to be
accepted are there any outstanding asks that would useful to add if we were
going to bump the protocol?

I'm pretty strongly opposed to this. What's the limiting factor when
adding such information? I think clients that want something like this
ought to query the database for catalog information when getting schema
information.

Greetings,

Andres Freund

#18Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#17)
Re: Binary support for pgoutput plugin

Hi,

On Wed, 5 Jun 2019 at 18:50, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-06-05 18:47:57 -0400, Dave Cramer wrote:

So one of the things they would like added is to get not null information
in the schema record. This is so they can mark the field Optional in

Java.

I presume this would also have some uses in other languages. As I
understand it this would require a protocol bump. If this were to be
accepted are there any outstanding asks that would useful to add if we

were

going to bump the protocol?

I'm pretty strongly opposed to this. What's the limiting factor when
adding such information? I think clients that want something like this
ought to query the database for catalog information when getting schema
information.

I'm not intimately familiar with their code. I will query them more about
the ask.

I am curious why you are "strongly" opposed however. We already have the
information. Adding doesn't seem onerous.

Dave

#19Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#18)
Re: Binary support for pgoutput plugin

Hi,

On 2019-06-05 19:05:05 -0400, Dave Cramer wrote:

I am curious why you are "strongly" opposed however. We already have the
information. Adding doesn't seem onerous.

(thought I'd already replied with this)

The problem is that I don't recognize a limiting principle:

If we want NOT NULL information for clients, why don't we include the
underlying types for arrays, and the fields in composite types? What
about foreign keys? And unique keys?

And then we suddenly need tracking for all these, so we don't always
send out that information when we previously already did - and in some
of the cases there's no infrastructure for that.

I just don't quite buy that the output plugin build for pg's logical
replication needs is a good place to include a continually increasing
amount of metadata that logical replication doesn't need. That's going
to add overhead and make the code more complicated.

Greetings,

Andres Freund

#20Chapman Flack
chap@anastigmatix.net
In reply to: Andres Freund (#19)
Re: Binary support for pgoutput plugin

On 06/07/19 19:27, Andres Freund wrote:

The problem is that I don't recognize a limiting principle:

If we want NOT NULL information for clients, why don't we include the
underlying types for arrays, and the fields in composite types? What
about foreign keys? And unique keys?

This reminds me of an idea I had for a future fe/be protocol version,
right after a talk by Alyssa Ritchie and Henrietta Dombrovskaya at the
last 2Q PGConf. [1]https://www.2qpgconf.com/schedule/information-exchange-techniques-for-javapostgresql-applications/

It seems they had ended up designing a whole 'nother "protocol level"
involving queries wrapping their results as JSON and an app layer that
unwraps again, after trying a simpler first approach that was foiled by the
inability to see into arrays and anonymous record types in the 'describe'
response.

I thought, in a new protocol rev, why not let the driver send additional
'describe' messages after the first one, to drill into structure of
individual columns mentioned in the first response, before sending the
'execute' message?

If it doesn't want the further detail, it doesn't have to ask.

And then we suddenly need tracking for all these, so we don't always
send out that information when we previously already did

If it's up to the client driver, it can track what it needs or already has.

I haven't looked too deeply into the replication protocol ... it happens
under a kind of copy-both, right?, so maybe there's a way for the receiver
to send some inquiries back, but maybe in a windowed, full-duplex way where
it might have to buffer some incoming messages before getting the response
to an inquiry message it sent.

Would those be thinkable thoughts for a future protocol rev?

Regards,
-Chap

[1]: https://www.2qpgconf.com/schedule/information-exchange-techniques-for-javapostgresql-applications/
https://www.2qpgconf.com/schedule/information-exchange-techniques-for-javapostgresql-applications/

#21Andres Freund
andres@anarazel.de
In reply to: Chapman Flack (#20)
#22Chapman Flack
chap@anastigmatix.net
In reply to: Andres Freund (#21)
#23Andres Freund
andres@anarazel.de
In reply to: Chapman Flack (#22)
#24Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#21)
#25Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#12)
#26Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#25)
#27Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#26)
#28Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Dave Cramer (#27)
#29Dave Cramer
pg@fastcrypt.com
In reply to: Tomas Vondra (#28)
#30Petr Jelinek
petr@2ndquadrant.com
In reply to: Dave Cramer (#29)
#31Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#30)
#32Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#30)
#33Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Dave Cramer (#32)
#34Dave Cramer
pg@fastcrypt.com
In reply to: Tomas Vondra (#33)
#35Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#34)
#36Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#17)
#37Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dave Cramer (#35)
#38Dave Cramer
pg@fastcrypt.com
In reply to: Dmitry Dolgov (#37)
#39Thomas Munro
thomas.munro@gmail.com
In reply to: Dave Cramer (#38)
#40Dave Cramer
pg@fastcrypt.com
In reply to: Thomas Munro (#39)
#41Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dave Cramer (#40)
#42Dave Cramer
pg@fastcrypt.com
In reply to: Dmitry Dolgov (#41)
#43Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dave Cramer (#42)
#44Dave Cramer
pg@fastcrypt.com
In reply to: Alvaro Herrera (#43)
#45Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#44)
#46Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dave Cramer (#42)
#47Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dave Cramer (#45)
#48Dave Cramer
pg@fastcrypt.com
In reply to: Alvaro Herrera (#47)
#49Michael Paquier
michael@paquier.xyz
In reply to: Dave Cramer (#48)
#50Dave Cramer
pg@fastcrypt.com
In reply to: Michael Paquier (#49)
#51Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#50)
#52Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#50)
#53Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#52)
#54Petr Jelinek
petr@2ndquadrant.com
In reply to: Dave Cramer (#53)
#55Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#54)
#56Petr Jelinek
petr@2ndquadrant.com
In reply to: Dave Cramer (#55)
#57Dave Cramer
pg@fastcrypt.com
In reply to: Petr Jelinek (#56)
#58Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#57)
#59Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#58)
#60Dave Cramer
pg@fastcrypt.com
In reply to: Daniel Gustafsson (#59)
#61Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#60)
#62Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#61)
#63Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniel Gustafsson (#62)
#64Daniel Gustafsson
daniel@yesql.se
In reply to: Alvaro Herrera (#63)
#65Dave Cramer
pg@fastcrypt.com
In reply to: Daniel Gustafsson (#64)
#66Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#65)
#67Dave Cramer
pg@fastcrypt.com
In reply to: Daniel Gustafsson (#66)
#68Dave Cramer
pg@fastcrypt.com
In reply to: Dave Cramer (#67)
#69Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#68)
#70Dave Cramer
pg@fastcrypt.com
In reply to: Daniel Gustafsson (#69)
#71Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#70)
#72Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#71)
#73Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#72)
#74Petr Jelinek
petr@2ndquadrant.com
In reply to: Dave Cramer (#73)
#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Petr Jelinek (#74)
#76Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#75)
#77Daniel Gustafsson
daniel@yesql.se
In reply to: Dave Cramer (#76)
#78Dave Cramer
pg@fastcrypt.com
In reply to: Daniel Gustafsson (#77)
#79Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#78)
#80Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#79)
#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#80)
#82Andres Freund
andres@anarazel.de
In reply to: Dave Cramer (#80)
#83Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#82)
#84Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#83)
#85Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#84)
#86Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#85)
#87Dave Cramer
pg@fastcrypt.com
In reply to: Andres Freund (#86)
#88Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#87)
#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#87)
In reply to: Tom Lane (#89)
#91Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#90)
#92Petr Jelinek
petr@2ndquadrant.com
In reply to: Tom Lane (#91)
#93Tom Lane
tgl@sss.pgh.pa.us
In reply to: Petr Jelinek (#92)