SendRowDescriptionMessage() is slow for queries with a lot of columns
Hi,
When running workloads that include fast queries with a lot of columns,
SendRowDescriptionMessage(), and the routines it calls, becomes a
bottleneck. Besides syscache lookups (see [1] and [2]) a major cost of
that is manipulation of the StringBuffer and the version specific
branches in the per-attribute loop. As so often, the performance
differential of this patch gets bigger when the other performance
patches are applied.
The issues in SendRowDescriptionMessage() are the following:
1) All the pq_sendint calls, and also the pq_sendstring, are
expensive. The amount of calculations done for a single 2/4 byte
addition to the stringbuffer (particularly enlargeStringInfo()) is
problematic, as are the reallocations themselves.
I addressed this by adding pq_send*_pre() wrappers, implemented as
inline functions, that require that memory is pre-allocated.
Combining that with an enlargeStringInfo() call in
SendRowDescriptionMessage() that pre-allocates the maximum required
memory yields a pretty good speedup.
I'm not yet super sure about the implementation. For one, I'm not
sure this shouldn't instead be stringinfo.h functions, with very very
tiny pqformat.h wrappers. But conversely I think it'd make a lot of
sense for the pqformat integer functions to get rid of the
continually maintained trailing null-byte - I was hoping the compiler
could optimize that away, but alas, no luck. As soon as a single
integer is sent, you can't rely on 0 terminated strings anyway.
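A minimal standalone sketch of the pre-allocation idea (the simplified buffer struct and the names buf_enlarge/append_int16_pre here are illustrative stand-ins, not PostgreSQL's actual StringInfo/pqformat API): do one capacity check for the whole message, then append fields with unchecked inline writes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a StringInfo-style growable buffer. */
typedef struct
{
    char *data;
    int   len;
    int   maxlen;
} Buf;

static void
buf_init(Buf *buf)
{
    buf->maxlen = 64;
    buf->len = 0;
    buf->data = malloc(buf->maxlen);
}

/* The one (relatively expensive) capacity check, done once per message
 * instead of once per appended field. */
static void
buf_enlarge(Buf *buf, int needed)
{
    while (buf->len + needed > buf->maxlen)
        buf->maxlen *= 2;
    buf->data = realloc(buf->data, buf->maxlen);
}

/* Unchecked append of a 16-bit big-endian value; the caller must have
 * pre-allocated enough space via buf_enlarge(). */
static inline void
append_int16_pre(Buf *buf, uint16_t v)
{
    buf->data[buf->len++] = (char) (v >> 8);
    buf->data[buf->len++] = (char) v;
}

/* Unchecked append of a 32-bit big-endian value. */
static inline void
append_int32_pre(Buf *buf, uint32_t v)
{
    append_int16_pre(buf, (uint16_t) (v >> 16));
    append_int16_pre(buf, (uint16_t) v);
}
```

A row-description-style caller would then do a single buf_enlarge(buf, natts * max_bytes_per_attribute) before its per-attribute loop.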
2) It creates a new StringInfo in every iteration. That causes
noticeable memory management overhead, and restarts the size of the
buffer every time. When the description is relatively large, that
leads to a number of reallocations for every
SendRowDescriptionMessage() call.
My solution here was to create a persistent StringInfo for
SendRowDescriptionMessage() that never gets deallocated (just
reset). That, in combination with new versions of
pq_beginmessage/pq_endmessage that keep the buffer alive, yields a nice
speedup.
Currently I'm using a static variable to allocate a string buffer for
the function. It'd probably be better to manage that outside of a single
function - I'm also wondering why we're allocating a good number of
stringinfos in various places, rather than reuse them. Most of them
can't be entered recursively, and even if that's a concern, we could
have one reusable per portal or such.
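A sketch of the reuse pattern (the names and the simplified struct here are made up for illustration; the actual patch adds pq_beginmessage_reuse-style functions): the buffer is allocated once, and between messages only its length is reset, so capacity grown for one large row description is kept for the next.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    char *data;
    int   len;
    int   maxlen;
} Buf;

/* One long-lived buffer; the idea is that it lives for the whole backend
 * rather than being created afresh on every call. */
static Buf msg_buf;

static void
begin_message_reuse(Buf *buf, char msgtype)
{
    if (buf->data == NULL)          /* first use: allocate once */
    {
        buf->maxlen = 1024;
        buf->data = malloc(buf->maxlen);
    }
    buf->len = 0;                   /* later uses: reset, keep allocation */
    buf->data[buf->len++] = msgtype;
}
```

The win is twofold: no per-call palloc/pfree traffic, and no re-growing of the buffer from its initial size on every message.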
3) The v2/v3 branches in the attribute loop are noticeable (others too,
but well...). I solved that by splitting out the v2 and v3
per-attribute loops into separate functions. Imo also a good chunk
more readable.
Comments?
Greetings,
Andres Freund
[1]: http://archives.postgresql.org/message-id/CA+Tgmobj72E_tG6w98H0oUbCCUmoC4uRmjocYPbnWC2RxYACeg@mail.gmail.com
[2]: http://archives.postgresql.org/message-id/20170914061207.zxotvyopetm7lrrp%40alap3.anarazel.de
On 14 September 2017 at 07:34, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> When running workloads that include fast queries with a lot of columns,
> SendRowDescriptionMessage(), and the routines it calls, becomes a
> bottleneck. Besides syscache lookups (see [1] and [2]) a major cost of
> that is manipulation of the StringBuffer and the version specific
> branches in the per-attribute loop.
>
> [...]
>
> Comments?
I've run a fairly basic test with a table with 101 columns, selecting
a single row from the table and I get the following results:
Columns with 1-character names:
master (80 jobs, 80 connections, 60 seconds):
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 559275
latency average = 8.596 ms
tps = 9306.984058 (including connections establishing)
tps = 11144.622096 (excluding connections establishing)
patched:
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 585526
latency average = 8.210 ms
tps = 9744.240847 (including connections establishing)
tps = 11454.339301 (excluding connections establishing)
master (80 jobs, 80 connections, 120 seconds):
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1108312
latency average = 8.668 ms
tps = 9229.356994 (including connections establishing)
tps = 9987.385281 (excluding connections establishing)
patched:
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1130313
latency average = 8.499 ms
tps = 9412.904876 (including connections establishing)
tps = 10319.404302 (excluding connections establishing)
Columns with at least 42-character names:
master (80 jobs, 80 connections, 60 seconds):
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 197815
latency average = 24.308 ms
tps = 3291.157486 (including connections establishing)
tps = 3825.309489 (excluding connections establishing)
patched:
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 198549
latency average = 24.214 ms
tps = 3303.896651 (including connections establishing)
tps = 3895.738024 (excluding connections establishing)
master (80 jobs, 80 connections, 120 seconds):
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 391312
latency average = 24.551 ms
tps = 3258.493026 (including connections establishing)
tps = 3497.581844 (excluding connections establishing)
patched:
transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 391733
latency average = 24.526 ms
tps = 3261.805060 (including connections establishing)
tps = 3552.604408 (excluding connections establishing)
This is just using the patches you posted on this thread. Does this
exercise the patch in the way you intended?
Regards
Thom
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Thom,
Thanks for taking a whack at this!
On 2017-09-15 12:16:22 +0100, Thom Brown wrote:
> I've run a fairly basic test with a table with 101 columns, selecting
> a single row from the table and I get the following results:
>
> Columns with 1-character names:
>
> master (80 jobs, 80 connections, 60 seconds):
FWIW, I don't think it's useful to test this with a lot of concurrency -
at that point you're likely saturating the machine with context switches
etc. unless you have a lot of cores. As this isn't related to
concurrency I'd rather just check a single connection.
> transaction type: /tmp/test.sql
> scaling factor: 1
> query mode: simple
I think you'd need to use prepared statements / -M prepared to see
benefits - when parsing statements for every execution the bottleneck is
elsewhere (hello O(#available_columns * #selected_columns) in
colNameToVar()).
Greetings,
Andres Freund
On 15 September 2017 at 19:23, Andres Freund <andres@anarazel.de> wrote:
> Hi Thom,
>
> Thanks for taking a whack at this!
>
> FWIW, I don't think it's useful to test this with a lot of concurrency -
> at that point you're likely saturating the machine with context switches
> etc. unless you have a lot of cores. As this isn't related to
> concurrency I'd rather just check a single connection.
>
> I think you'd need to use prepared statements / -M prepared to see
> benefits - when parsing statements for every execution the bottleneck is
> elsewhere (hello O(#available_columns * #selected_columns) in
> colNameToVar()).
Okay, I've retested it using a prepared statement executed 100,000
times (which selects a single row from a table with 101 columns, each
column is 42-43 characters long, and 2 rows in the table), and I get
the following:
master:
real 1m23.485s
user 1m2.800s
sys 0m1.200s
patched:
real 1m22.530s
user 1m2.860s
sys 0m1.140s
Not sure why I'm not seeing the gain.
Thom
Hi Thom,
On 2017-09-15 22:05:35 +0100, Thom Brown wrote:
> Okay, I've retested it using a prepared statement executed 100,000
> times (which selects a single row from a table with 101 columns, each
> column is 42-43 characters long, and 2 rows in the table), and I get
> the following:
>
> master:
> real 1m23.485s
> user 1m2.800s
> sys 0m1.200s
>
> patched:
> real 1m22.530s
> user 1m2.860s
> sys 0m1.140s
>
> Not sure why I'm not seeing the gain.
I suspect a part of that is that you're testing the patch in isolation,
whereas I tested it as part of multiple speedup patches. There's some
bigger bottlenecks than this one. I've attached my whole stack.
But even that being said, here's the result for these patches in
isolation on my machine:
setup:
psql -p 5440 -f ~/tmp/create_many_cols.sql
pgbench -M prepared -f ~/tmp/pgbench-many-cols.sql -n -T 10 -P 1
master (best of three):
tps = 13347.023301 (excluding connections establishing)
patched (best of three):
tps = 14309.690741 (excluding connections establishing)
Which is a bigger win than what you're observing. I've also attached the
benchmark scripts I used. Could you detail how your benchmark works a
bit more? Any chance you looped in plpgsql or such?
Just for fun/reference, these are the results with all the patches
applied:
tps = 19069.115553 (excluding connections establishing)
and with just this patch reverted:
tps = 17342.006825 (excluding connections establishing)
Regards,
Andres
Attachments:
0001-Speedup-pgstat_report_activity-by-moving-mb-aware-v1.patch
0002-Add-more-efficient-functions-to-pqformat-APIv1.patch
0003-Improve-performance-of-SendRowDescriptionMessagev1.patch
0004-Add-inline-murmurhash32-int32-functionv1.patch
0005-Replace-binary-search-in-fmgr_isbuiltin-with-hashtv1.patch
0006-Add-pg_noinline-macro-to-c.hv1.patch
0007-Improve-sys-catcache-performancev1.patch
0008-WIP-Improve-getBaseTypeAndTypemod-performance-for-v1.patch
On Sat, Sep 16, 2017 at 3:03 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi Thom,
>
> I suspect a part of that is that you're testing the patch in isolation,
> whereas I tested it as part of multiple speedup patches. There's some
> bigger bottlenecks than this one. I've attached my whole stack.
>
> But even that being said, here's the result for these patches in
> isolation on my machine:
I have run the same test on Scylla for a single client. I have used
the same setup scripts as shared by you in the above mail.
[mithun.cy@localhost ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 56
On-line CPU(s) list: 0-55
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 1200.761
BogoMIPS: 4594.35
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-13,28-41
NUMA node1 CPU(s): 14-27,42-55
(result is median of 3 pgbench runs, each run 5 mins)
Base TPS:
========
12199.237133
With Patches TPS:
===============
(0002-Add-more-efficient-functions-to-pqformat-API.patch +
0003-Improve-performance-of-SendRowDescriptionMessage.patch)
13003.198407 (an improvement of 6.5%)
The perf report also says the same.
Base: perf_bmany_cols
-------------------------------
19.94% 1.86% 11087 postgres postgres [.]
SendRowDescriptionMessage
|
---SendRowDescriptionMessage
|
|--99.91%-- exec_describe_portal_message
| PostgresMain
| ExitPostmaster
| BackendStartup
| ServerLoop
| PostmasterMain
| startup_hacks
| __libc_start_main
--0.09%-- [...]
After Patch: perf_many_cols
--------------------------------------
16.80% 0.04% 158 postgres postgres [.]
SendRowDescriptionMessage
|
---SendRowDescriptionMessage
|
|--99.89%-- exec_describe_portal_message
| PostgresMain
| ExitPostmaster
| BackendStartup
| ServerLoop
| PostmasterMain
| startup_hacks
| __libc_start_main
--0.11%-- [...]
So I think the performance gain is visible. We saved a good number of
execution cycles in SendRowDescriptionMessage (my callgrind report
confirmed the same) when we project a large number of columns in the
query with these new patches.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachments:
perf_report.tar.gz
On Mon, Sep 18, 2017 at 1:38 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
> So I think the performance gain is visible. We saved a good number of
> execution cycles in SendRowDescriptionMessage (my callgrind report
> confirmed the same) when we project a large number of columns in the
> query with these new patches.
I have tested the patch; it looks good to me, and I can see the
performance improvement grow as the number of columns projected in the
query increases. There are some cosmetic issues in the patch (pgindent
issues and an end-of-file CR) that need changes if they are considered
valid issues. The rest looks okay to me.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Hi,
On 2017-09-13 23:34:18 -0700, Andres Freund wrote:
> I'm not yet super sure about the implementation. For one, I'm not
> sure this shouldn't instead be stringinfo.h functions, with very very
> tiny pqformat.h wrappers. But conversely I think it'd make a lot of
> sense for the pqformat integer functions to get rid of the
> continually maintained trailing null-byte - I was hoping the compiler
> could optimize that away, but alas, no luck. As soon as a single
> integer is sent, you can't rely on 0 terminated strings anyway.
I'd been wondering about missing CPU optimizations after the patch, and
hunted it down. Turns out the problem is that htons/ntohs are, on pretty
much all glibc versions, implemented using inline assembler. Which in
turns allows the compiler very little freedom to perform optimizations,
because it doesn't know what's actually happening.
Attached is an extension of the already existing pg_bswap.h that
a) adds 16 bit support
b) moves everything to inline functions, removing multiple evaluation
hazards that were present everywhere.
c) adds pg_nto{s,l,ll} and pg_hton{s,l,ll} wrappers that only do work
if necessary.
This'll allow the later patches to allow the compiler to perform the
relevant optimizations. It also allows to optimize e.g. pq_sendint64()
to avoid having to do multiple byteswaps.
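The shape of such wrappers, sketched with GCC/Clang builtins (an assumption for illustration; the real patch would use configure checks rather than relying on __builtin_bswap* and __BYTE_ORDER__ being available): because these are ordinary inline functions rather than inline assembler, the compiler can see through them and, for example, cancel back-to-back swaps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Network byte order is big-endian: swap on little-endian hosts, no-op
 * on big-endian ones. Illustrative names, GCC/Clang builtins assumed. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
static inline uint16_t hton16(uint16_t x) { return __builtin_bswap16(x); }
static inline uint32_t hton32(uint32_t x) { return __builtin_bswap32(x); }
static inline uint64_t hton64(uint64_t x) { return __builtin_bswap64(x); }
#else
static inline uint16_t hton16(uint16_t x) { return x; }
static inline uint32_t hton32(uint32_t x) { return x; }
static inline uint64_t hton64(uint64_t x) { return x; }
#endif
```

Being plain functions, these also evaluate their argument exactly once, which removes the multiple-evaluation hazard of the old macro-style usage.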
Greetings,
Andres Freund
Attachments:
0001-Extend-revamp-pg_bswap.h-infrastructure.patch
On Thu, Sep 28, 2017 at 2:20 AM, Andres Freund <andres@anarazel.de> wrote:
> This'll allow the later patches to allow the compiler to perform the
> relevant optimizations. It also allows to optimize e.g. pq_sendint64()
> to avoid having to do multiple byteswaps.
I guess that you could clean up the 8-byte duplicate implementations
in pg_rewind's libpq_fetch.c (pg_recvint64) and in pg_basebackup's
streamutil.c (fe_recvint64) at the same time, right?
--
Michael
Andres Freund <andres@anarazel.de> writes:
> Attached is an extension of the already existing pg_bswap.h that
> a) adds 16 bit support
> b) moves everything to inline functions, removing multiple evaluation
>    hazards that were present everywhere.
> c) adds pg_nto{s,l,ll} and pg_hton{s,l,ll} wrappers that only do work
>    if necessary.
Could we please not perpetuate the brain-dead "s" and "l" suffixes
on these names? Given the lack of standardization as to how long
"long" is, that's entirely unhelpful. I'd be fine with names like
pg_ntoh16/32/64 and pg_hton16/32/64.
regards, tom lane
On 2017-09-28 00:01:53 -0400, Tom Lane wrote:
> Could we please not perpetuate the brain-dead "s" and "l" suffixes
> on these names? Given the lack of standardization as to how long
> "long" is, that's entirely unhelpful. I'd be fine with names like
> pg_ntoh16/32/64 and pg_hton16/32/64.
Yes. I'd polled a few people and they leaned towards those. But I'm
perfectly happy to do that renaming.
Greetings,
Andres Freund
On September 27, 2017 9:06:49 PM PDT, Andres Freund <andres@anarazel.de> wrote:
> On 2017-09-28 00:01:53 -0400, Tom Lane wrote:
> > Could we please not perpetuate the brain-dead "s" and "l" suffixes
> > on these names? Given the lack of standardization as to how long
> > "long" is, that's entirely unhelpful. I'd be fine with names like
> > pg_ntoh16/32/64 and pg_hton16/32/64.
>
> Yes. I'd polled a few people and they leaned towards those. But I'm
> perfectly happy to do that renaming.
If somebody wants to argue for replacing hton/ntoh with {to,from}big or *be, now's the time.
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Thu, Sep 28, 2017 at 1:31 PM, Andres Freund <andres@anarazel.de> wrote:
> If somebody wants to argue for replacing hton/ntoh with {to,from}big or *be, now's the time.
OK. pg_hton16/32/64 and pg_ntoh16/32/64 are fine enough IMO.
--
Michael
On 09/27/2017 10:50 PM, Andres Freund wrote:
> This'll allow the later patches to allow the compiler to perform the
> relevant optimizations. It also allows to optimize e.g. pq_sendint64()
> to avoid having to do multiple byteswaps.
After applying all the required patches, I am able to see some performance gain.

Virtual machine configuration: CentOS 6.5 x64 / 16 GB RAM / 8 vCPU core
processor

./pgbench -M prepared -j 10 -c 10 -f /tmp/pgbench-many-cols.sql postgres
-T TIME

After taking the median of 3 runs:

Case 1 - TIME=300
PG HEAD => tps = 41285.089261 (excluding connections establishing)
PG HEAD+patch => tps = 42446.626947 (2.81+% vs. head)

Case 2 - TIME=500
PG HEAD => tps = 41252.897670 (excluding connections establishing)
PG HEAD+patch => tps = 42257.439550 (2.43+% vs. head)

Case 3 - TIME=1000
PG HEAD => tps = 1061.031463 (excluding connections establishing)
PG HEAD+patch => tps = 8011.784839 (3.30+% vs. head)

Case 4 - TIME=1500
PG HEAD => tps = 40365.099628 (excluding connections establishing)
PG HEAD+patch => tps = 42385.372848 (5.00+% vs. head)
--
regards,tushar
EnterpriseDB https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
On Fri, Sep 29, 2017 at 5:02 AM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> Case 3 - TIME=1000
> PG HEAD => tps = 1061.031463 (excluding connections establishing)
> PG HEAD+patch => tps = 8011.784839 (3.30+% vs. head)
Going from 1061 tps to 8011 tps is not a 3.3% gain. I assume you
garbled this output somehow.
Also note that you really mean +3.30% not 3.30+%.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2017-09-28 14:23:45 +0900, Michael Paquier wrote:
> OK. pg_hton16/32/64 and pg_ntoh16/32/64 are fine enough IMO.
Does anybody have an opinion on whether we'll want to convert examples
like testlibpq3.c (included in libpq.sgml) too? I'm inclined not to,
because currently using pg_bswap.h requires c.h presence (just for a few
typedefs and configure data). There's also not really a pressing need.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Does anybody have an opinion on whether we'll want to convert examples
> like testlibpq3.c (included in libpq.sgml) too? I'm inclined not to,
> because currently using pg_bswap.h requires c.h presence (just for a few
> typedefs and configure data). There's also not really a pressing need.
We certainly mustn't encourage libpq users to start depending on c.h,
so let's leave that alone.
regards, tom lane
On 2017-09-29 17:56:10 -0400, Tom Lane wrote:
> We certainly mustn't encourage libpq users to start depending on c.h,
> so let's leave that alone.
Here are two patches:
0001: Previously submitted changes to pg_bswap.h, addressing concerns
like the renaming
0002: Move over most users of ntoh[sl]/hton[sl] over to pg_bswap.h.
Note that the latter patch includes replacing open-coded byte swapping
of 64bit integers (using two 32 bit swaps) with a single 64bit
swap. I've also removed pg_recvint64 - it's now a single pg_ntoh64 - as
its name strikes me as misleading.
Where it looked applicable I have removed netinet/in.h and arpa/inet.h
usage, which previously provided the relevant functionality. It's
perfectly possible that I missed other reasons for including those;
the buildfarm will tell.
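A sketch of the difference (generic code, not the exact pg_recvint64/fe_recvint64 sources): the open-coded style decodes a big-endian int64 as two 32-bit halves combined by a shift, while the replacement does a single 64-bit byte swap; both must agree on every input.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old open-coded style: decode two big-endian 32-bit halves, combine. */
static uint64_t
recv_int64_two_halves(const unsigned char *buf)
{
    uint32_t hi, lo;

    hi = ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16) |
         ((uint32_t) buf[2] << 8) | buf[3];
    lo = ((uint32_t) buf[4] << 24) | ((uint32_t) buf[5] << 16) |
         ((uint32_t) buf[6] << 8) | buf[7];
    return ((uint64_t) hi << 32) | lo;
}

/* New style: load and do one 64-bit swap (GCC/Clang builtin assumed). */
static uint64_t
recv_int64_one_swap(const unsigned char *buf)
{
    uint64_t v;

    memcpy(&v, buf, sizeof(v));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap64(v);
#endif
    return v;
}
```

On typical hardware the one-swap form compiles down to a single load plus one bswap instruction, instead of two swaps, a shift, and an or.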
Greetings,
Andres Freund
Hi,
Attached is a revised version of this patchset. I'd like to get some
input on two points:
1) Does anybody have a better idea than the static buffer in
SendRowDescriptionMessage()? That's not particularly pretty, but
there's not really a convenient stringbuffer to use when called from
exec_describe_portal_message(). We could instead create a local
buffer for exec_describe_portal_message().
An alternative idea would be to have one reusable buffer created for
each transaction command, but I'm not sure that's really better.
2) There's a lot of remaining pq_sendint() callers in other parts of the
tree. If others are ok with that, I'd do a separate pass over them.
I'd say that even after doing that, we should keep pq_sendint(),
because a lot of extension code is using that.
3) The use of restrict, with a configure based fallback, is something
we've not done before, but it's C99 and delivers significantly more
efficient code. Any arguments against?
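For illustration, the kind of function where restrict pays off (a generic example, not code from the patch): without the qualifier the compiler must assume dst may alias src and reload src after every store; with it, loads can stay in registers and the loop can be vectorized.

```c
#include <assert.h>

/* restrict promises the compiler that dst and src never overlap, so the
 * stores through dst cannot change what src points at. A configure-based
 * fallback would #define a pg_restrict macro to nothing on compilers
 * without C99 restrict support. */
static void
add_one(int n, int *restrict dst, const int *restrict src)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}
```

The promise is the caller's responsibility: passing overlapping arrays to such a function is undefined behavior.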
Regards,
Andres
Attachments:
0001-Add-configure-infrastructure-to-detect-support-forv2.patchtext/x-diff; charset=us-asciiDownload+67-1
0002-Allow-to-avoid-NUL-byte-management-for-stringinfosv2.patchtext/x-diff; charset=us-asciiDownload+37-11
0003-Add-more-efficient-functions-to-pqformat-APIv2.patchtext/x-diff; charset=us-asciiDownload+195-69
0004-Use-one-stringbuffer-for-all-rows-printed-in-printv2.patchtext/x-diff; charset=us-asciiDownload+11-9
0005-Improve-performance-of-SendRowDescriptionMessagev2.patchtext/x-diff; charset=us-asciiDownload+121-39
0006-Replace-remaining-printtup-uses-of-pq_sendint-withv2.patchtext/x-diff; charset=us-asciiDownload+8-9
On Tue, Oct 3, 2017 at 3:55 AM, Andres Freund <andres@anarazel.de> wrote:
> Attached is a revised version of this patchset.
I don't much like the functions with "_pre" affixed to their names.
It's not at all clear that "pre" means "preallocated"; it sounds more
like you're doing something ahead of time. I wonder about maybe
calling these e.g. pq_writeint16, with "write" meaning "assume
preallocation" and "send" meaning "don't assume preallocation". There
could be other ideas, too.
> I'd like to get some input on two points:
>
> 1) Does anybody have a better idea than the static buffer in
> SendRowDescriptionMessage()? That's not particularly pretty, but
> there's not really a convenient stringbuffer to use when called from
> exec_describe_portal_message(). We could instead create a local
> buffer for exec_describe_portal_message().
>
> An alternative idea would be to have one reusable buffer created for
> each transaction command, but I'm not sure that's really better.
I don't have a better idea.
> 2) There's a lot of remaining pq_sendint() callers in other parts of the
> tree. If others are ok with that, I'd do a separate pass over them.
> I'd say that even after doing that, we should keep pq_sendint(),
> because a lot of extension code is using that.
I think we should change everything to the new style and I wouldn't
object to removing pq_sendint() either. However, if we want to keep
it with a note that only extension code should use it, that's OK with
me, too.
> 3) The use of restrict, with a configure based fallback, is something
> we've not done before, but it's C99 and delivers significantly more
> efficient code. Any arguments against?
It's pretty unobvious why it helps here. I think you should add
comments. Also, unless I'm missing something, there's nothing to keep
pq_sendintXX_pre from causing an alignment fault except unbridled
optimism...
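One conventional way to sidestep the alignment concern (a hypothetical fix for illustration, not what the patch does): write through memcpy rather than a cast-and-store, which has defined behavior at any address and which compilers fold into a single store on hardware that permits unaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Alignment-safe store into a byte buffer: unlike *(uint32_t *) p = v,
 * this cannot fault on strict-alignment platforms, and optimizing
 * compilers emit it as one plain store where the hardware allows it. */
static inline void
store_uint32(char *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}
```

The same pattern works for 16- and 64-bit writes, so an unaligned tail position in the message buffer costs nothing extra in practice.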
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company