Performance penalty when requesting text values in binary format

Started by Jack Christensen over 5 years ago, 3 messages
#1 Jack Christensen
jack@jncsoftware.com

I'm the creator of the PostgreSQL driver pgx (https://github.com/jackc/pgx)
for the Go language. I have found significant performance advantages to
using the extended protocol and binary format values -- in particular for
types such as timestamptz.

However, I was recently very surprised to find that it is significantly
slower to select a text type value in the binary format. For an example
case of selecting 1,000 rows, each with 5 text columns of 16 bytes each, the
application time from sending the query to having received the entire
response is approximately 16% slower. Here is a link to the test benchmark:
https://github.com/jackc/pg_text_binary_bench

Given that the text and binary formats for the text type are identical, I
would not have expected any performance differences.

My C is rusty and my knowledge of the PG server internals is minimal, but
the performance difference appears to come from the function textsend
creating an extra copy where textout simply returns a pointer to the
existing data. That copy seems to be superfluous.

I can work around this by specifying the format per result column instead
of specifying binary for all, but this performance bug / anomaly seemed
worth reporting.

Jack

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jack Christensen (#1)
Re: Performance penalty when requesting text values in binary format

On Sat, 2020-05-16 at 20:12 -0500, Jack Christensen wrote:

> I'm the creator of the PostgreSQL driver pgx (https://github.com/jackc/pgx) for the Go language.
> I have found significant performance advantages to using the extended protocol and binary format
> values -- in particular for types such as timestamptz.
>
> However, I was recently very surprised to find that it is significantly slower to select a text
> type value in the binary format. For an example case of selecting 1,000 rows each with 5 text
> columns of 16 bytes each the application time from sending the query to having received the
> entire response is approximately 16% slower. Here is a link to the test benchmark:
> https://github.com/jackc/pg_text_binary_bench
>
> Given that the text and binary formats for the text type are identical I would not have
> expected any performance differences.
>
> My C is rusty and my knowledge of the PG server internals is minimal but the performance
> difference appears to be that function textsend creates an extra copy where textout
> simply returns a pointer to the existing data. This seems to be superfluous.
>
> I can work around this by specifying the format per result column instead of specifying
> binary for all but this performance bug / anomaly seemed worth reporting.

Did you profile your benchmark?
It would be interesting to know where the time is spent.

Yours,
Laurenz Albe

#3 Jack Christensen
jack@jncsoftware.com
In reply to: Laurenz Albe (#2)
Re: Performance penalty when requesting text values in binary format

On Mon, May 18, 2020 at 7:07 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

> Did you profile your benchmark?
> It would be interesting to know where the time is spent.

Unfortunately, I have not. Fortunately, it appears that Tom Lane recognized
this as a part of another issue and has prepared a patch.

/messages/by-id/6648.1589819885@sss.pgh.pa.us

Thanks,
Jack