[PATCH] Better Performance for PostgreSQL with large INSERTs

Started by Philipp Marek · 4 months ago · 13 messages
#1 Philipp Marek
philipp@marek.priv.at

Sometimes, storing documents (e.g. PDFs) in a database
is much easier than using separate storage (like S3, NFS, etc.).

(Because of issues like backup integrity, availability,
service dependencies, access rights, encryption of data, etc.)

With this patch:

```diff
diff --git i/src/backend/libpq/pqcomm.c w/src/backend/libpq/pqcomm.c
index e517146..936b073 100644
--- i/src/backend/libpq/pqcomm.c
+++ w/src/backend/libpq/pqcomm.c
@@ -117,7 +117,8 @@ static List *sock_paths = NIL;
   */
  #define PQ_SEND_BUFFER_SIZE 8192
-#define PQ_RECV_BUFFER_SIZE 8192
+#define PQ_RECV_BUFFER_SIZE 2097152
+

static char *PqSendBuffer;
static int PqSendBufferSize; /* Size send buffer */
```

i.e. changing the network receive buffer size from 8kB to 2MB,
we got 7% better INSERT performance when storing BLOBs.

The 2MB value is just what we tried; 128kB or 256kB work as well.
The main point is to reduce the number of syscalls for receiving data
to about half of what it is with 8kB.

Thank you for your consideration!

Regards,

Phil

#2 Kirill Reshke
reshkekirill@gmail.com
In reply to: Philipp Marek (#1)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi!

On Mon, 15 Sept 2025 at 18:16, Philipp Marek <philipp@marek.priv.at> wrote:

> Sometimes, storing documents (e.g. PDFs) in a database
> is much easier than using separate storage (like S3, NFS, etc.).
>
> (Because of issues like backup integrity, availability,
> service dependencies, access rights, encryption of data, etc.)
>
> With this patch:
>
> ```diff
> diff --git i/src/backend/libpq/pqcomm.c w/src/backend/libpq/pqcomm.c
> index e517146..936b073 100644
> --- i/src/backend/libpq/pqcomm.c
> +++ w/src/backend/libpq/pqcomm.c
> @@ -117,7 +117,8 @@ static List *sock_paths = NIL;
>   */
>  #define PQ_SEND_BUFFER_SIZE 8192
> -#define PQ_RECV_BUFFER_SIZE 8192
> +#define PQ_RECV_BUFFER_SIZE 2097152
> +
> ```

Changing this constant will result in an overwhelming increase of
memory consumption for instances that work with a large number of
connections (max_connections ~ 1e4) for zero benefit.

--
Best regards,
Kirill Reshke

#3 Philipp Marek
philipp@marek.priv.at
In reply to: Kirill Reshke (#2)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi Kirill,

> Changing this constant will result in an overwhelming increase of
> memory consumption for instances that work with a large number of
> connections (max_connections ~ 1e4) for zero benefit.

No, it's not that bad.

1) Since this buffer is allocated in the .bss segment,
the RAM will only actually be provided on _first use_ --
so processes that only ever work with small queries will see no
difference.

2) Compared with temp_buffers' default of 8MB, work_mem's 4MB, etc.,
using an additional 256kB to 2MB of RAM _when needed_ is a good deal
for about 7% better performance.

#4 Philipp Marek
philipp@marek.priv.at
In reply to: Philipp Marek (#3)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Here's the patch again, this time with a 128kB buffer size.

This gives us nearly the same gains (~7%) for the blob INSERTs,
and the additional memory usage (120kB) shouldn't really matter,
given the "temp_buffers" 8MB and "work_mem" 4MB defaults.

Making it configurable would give a much more complex patch --
so I suggest just using this fixed size.

Thanks!

```diff
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 25f739a6..72d092b2 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -117,7 +117,7 @@ static List *sock_paths = NIL;
  */
 #define PQ_SEND_BUFFER_SIZE 8192
-#define PQ_RECV_BUFFER_SIZE 8192
+#define PQ_RECV_BUFFER_SIZE (128 * 1024)
 
 static char *PqSendBuffer;
 static int PqSendBufferSize; /* Size send buffer */
```

#5 Filip Janus
fjanus@redhat.com
In reply to: Philipp Marek (#4)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi,
7% is an interesting improvement for these cases; I see it as a nice
feature. What about adding a GUC variable to allow configuring
PQ_RECV_BUFFER_SIZE based on the user's needs?

-Filip-

út 30. 9. 2025 v 10:42 odesílatel Philipp Marek <philipp@marek.priv.at>
napsal:


> Here's the patch again, this time with a 128kB buffer size.
>
> This gives us nearly the same gains (~7%) for the blob INSERTs,
> and the additional memory usage (120kB) shouldn't really matter,
> given the "temp_buffers" 8MB and "work_mem" 4MB defaults.
>
> Making it configurable would give a much more complex patch --
> so I suggest just using this fixed size.
>
> Thanks!
>
> ```diff
> diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
> index 25f739a6..72d092b2 100644
> --- a/src/backend/libpq/pqcomm.c
> +++ b/src/backend/libpq/pqcomm.c
> @@ -117,7 +117,7 @@ static List *sock_paths = NIL;
>   */
>  #define PQ_SEND_BUFFER_SIZE 8192
> -#define PQ_RECV_BUFFER_SIZE 8192
> +#define PQ_RECV_BUFFER_SIZE (128 * 1024)
> 
>  static char *PqSendBuffer;
>  static int PqSendBufferSize; /* Size send buffer */
> ```

#6 Andres Freund
andres@anarazel.de
In reply to: Philipp Marek (#4)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi,

On 2025-09-30 10:42:00 +0200, Philipp Marek wrote:

> Here's the patch again, this time with a 128kB buffer size.
>
> This gives us nearly the same gains (~7%) for the blob INSERTs,
> and the additional memory usage (120kB) shouldn't really matter,
> given the "temp_buffers" 8MB and "work_mem" 4MB defaults.
>
> Making it configurable would give a much more complex patch --
> so I suggest just using this fixed size.

Have you tried to verify that this doesn't cause performance regressions in
other workloads? pq_recvbuf() has this code:

```c
	if (PqRecvPointer > 0)
	{
		if (PqRecvLength > PqRecvPointer)
		{
			/* still some unread data, left-justify it in the buffer */
			memmove(PqRecvBuffer, PqRecvBuffer + PqRecvPointer,
					PqRecvLength - PqRecvPointer);
			PqRecvLength -= PqRecvPointer;
			PqRecvPointer = 0;
		}
		else
			PqRecvLength = PqRecvPointer = 0;
	}
```

I do seem to recall that just increasing the buffer size substantially led to
more time being spent inside that memmove() (likely due to exceeding L1/L2).

Greetings,

Andres Freund

#7 Philipp Marek
philipp@marek.priv.at
In reply to: Filip Janus (#5)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi Filip,

> 7% is an interesting improvement for these cases; I see it as a nice
> feature. What about adding a GUC variable to allow configuring
> PQ_RECV_BUFFER_SIZE based on the user's needs?

Well, currently this is just the array size as a preprocessor macro,
so the executable's .bss gets sized differently.

A GUC would need quite a bit more code, including dynamic allocation
_and reallocation_ -- complexity I wanted to avoid for the first
round of discussion.

If a GUC fits the usage patterns better, I'm okay with it!

#8 Philipp Marek
philipp@marek.priv.at
In reply to: Andres Freund (#6)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi Andres,

> Have you tried to verify that this doesn't cause performance
> regressions in other workloads? pq_recvbuf() has this code:
>
> ...
>
> I do seem to recall that just increasing the buffer size substantially
> led to more time being spent inside that memmove() (likely due to
> exceeding L1/L2).

Do you have any pointers to discussions or other data about that?

My (quick) analysis was that clients that send one request,
wait for an answer, then send the next request wouldn't run that code
as there's nothing behind the individual requests that could be moved.

But yes, Pipeline Mode[1] might/would be affected.

The interesting question is how much data userspace can copy before
that means more load than a userspace-kernel-userspace round trip.
(I guess that moving 64kB or 128kB should be quicker, especially given
the various CPU mitigations.)

As long as there are complete requests in the buffer the memmove()
could be avoided; only the initial part of the first incomplete request
might need moving to the beginning.

That patch would be more than +- 1 line again ;)

The documentation says

> Pipelining is less useful, and more complex,
> when a single pipeline contains multiple transactions
> (see Section 32.5.1.3).

are there any benchmarks/usage statistics for pipeline mode?

Regards,

Phil

Ad 1: https://www.postgresql.org/docs/current/libpq-pipeline-mode.html

#9 Andres Freund
andres@anarazel.de
In reply to: Philipp Marek (#8)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi,

On 2025-10-07 15:03:29 +0200, Philipp Marek wrote:

> > Have you tried to verify that this doesn't cause performance
> > regressions in other workloads? pq_recvbuf() has this code:
> >
> > ...
> >
> > I do seem to recall that just increasing the buffer size substantially
> > led to more time being spent inside that memmove() (likely due to
> > exceeding L1/L2).
>
> Do you have any pointers to discussions or other data about that?
>
> My (quick) analysis was that clients that send one request,
> wait for an answer, then send the next request wouldn't run that code
> as there's nothing behind the individual requests that could be moved.
>
> But yes, Pipeline Mode[1] might/would be affected.
>
> The interesting question is how much data userspace can copy before
> that means more load than a userspace-kernel-userspace round trip.
> (I guess that moving 64kB or 128kB should be quicker, especially given
> the various CPU mitigations.)

I unfortunately don't remember the details of where I saw it happening.
I also suspect that whether it matters will depend a lot on hardware and
operating-system details (like the security mitigations you mention).

> As long as there are complete requests in the buffer the memmove()
> could be avoided; only the initial part of the first incomplete request
> might need moving to the beginning.

Right. I'd be inclined to say that ought to be addressed as part of this
patch; that way we can be pretty sure it's not going to cause regressions.

> The documentation says
>
> > Pipelining is less useful, and more complex,
> > when a single pipeline contains multiple transactions
> > (see Section 32.5.1.3).
>
> are there any benchmarks/usage statistics for pipeline mode?

You can write benchmarks for it using pgbench's pipeline support, with a
custom script.
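For illustration, such a custom script could look like this (a sketch: the table name pipe_test and the three-INSERT batch are made up; pgbench's \startpipeline/\endpipeline meta-commands require -M extended or -M prepared):

```shell
# Create a custom pgbench script that sends several INSERTs per
# pipeline flush (\startpipeline/\endpipeline are pgbench meta-commands).
cat > pipeline.sql <<'EOF'
\startpipeline
INSERT INTO pipe_test(v) VALUES (:client_id);
INSERT INTO pipe_test(v) VALUES (:client_id);
INSERT INTO pipe_test(v) VALUES (:client_id);
\endpipeline
EOF

# Then run it against a database that has the pipe_test table, e.g.:
#   pgbench -n -M prepared -f pipeline.sql -c 1 -T 10 postgres
cat pipeline.sql
```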

Greetings,

Andres Freund

#10 Filip Janus
fjanus@redhat.com
In reply to: Andres Freund (#9)
2 attachment(s)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

-Filip-

út 7. 10. 2025 v 16:54 odesílatel Andres Freund <andres@anarazel.de> napsal:

> > As long as there are complete requests in the buffer the memmove()
> > could be avoided; only the initial part of the first incomplete request
> > might need moving to the beginning.
>
> Right. I'd be inclined to say that ought to be addressed as part of this
> patch; that way we can be pretty sure it's not going to cause regressions.

I tried to benchmark the usage of memmove(), but I wasn’t able to hit the
memmove() part of the code. This led me to a deeper investigation, and I
realized that the memmove() call is probably in a dead part of the code.
pq_recvbuf is called when PqRecvPointer >= PqRecvLength, while memmove() is
called later only if PqRecvLength > PqRecvPointer.
This results in a contradiction.


I am also proposing the introduction of a new GUC variable for setting
PQ_RECV_BUFFER_SIZE in the first patch. And the second patch removes the
dead code.

Filip

Attachments:

0002-Remove-dead-memmove-code-from-pq_recvbuf.patch (application/octet-stream)
From b93a253e1e10ee0b3f20d3dc89e0300b477c5555 Mon Sep 17 00:00:00 2001
From: Filip Janus <fjanus@redhat.com>
Date: Wed, 26 Nov 2025 15:00:11 +0100
Subject: [PATCH 2/2] Remove dead memmove code from pq_recvbuf()

The memmove block in pq_recvbuf() is unreachable dead code that has
never been executed in practice.

Analysis:
pq_recvbuf() is only called from pq_getbytes() when the condition
PqRecvPointer >= PqRecvLength is true (buffer is empty or exhausted).

However, the memmove is only executed when PqRecvLength > PqRecvPointer
(buffer contains unread data).

These conditions are mutually exclusive:
  - Caller requires: PqRecvPointer >= PqRecvLength
  - memmove requires: PqRecvLength > PqRecvPointer
---
 src/backend/libpq/pqcomm.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index d4bd412b68d..f1c3601be66 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -902,15 +902,6 @@ pq_recvbuf(void)
 {
 	if (PqRecvPointer > 0)
 	{
-		if (PqRecvLength > PqRecvPointer)
-		{
-			/* still some unread data, left-justify it in the buffer */
-			memmove(PqRecvBuffer, PqRecvBuffer + PqRecvPointer,
-					PqRecvLength - PqRecvPointer);
-			PqRecvLength -= PqRecvPointer;
-			PqRecvPointer = 0;
-		}
-		else
 			PqRecvLength = PqRecvPointer = 0;
 	}
 
-- 
2.39.5 (Apple Git-154)

0001-Add-configurable-receive-buffer-size.patch (application/octet-stream)
From de5cd5744012a1ec1b20b16863ce24f45f30aac6 Mon Sep 17 00:00:00 2001
From: Filip Janus <fjanus@redhat.com>
Date: Wed, 12 Nov 2025 14:29:43 +0100
Subject: [PATCH 1/2] Add configurable receive buffer size

This commit introduces a new GUC parameter 'pq_recv_buffer_size' to allow
users to configure the size of the network receive buffer for each backend
connection. The receive buffer is now dynamically allocated instead of
using a fixed compile-time size.

Key changes:
- Make PqRecvBuffer dynamically allocated based on pq_recv_buffer_size GUC
- Add pq_recv_buffer_size GUC parameter (default 8kB, min 8kB, max MAX_KILOBYTES)

Benefits:
- Configurable buffer size allows tuning for workloads with large messages
- Maintains backward compatibility with 8KB default
---
 src/backend/libpq/pqcomm.c                    | 14 +++++++++-----
 src/backend/utils/init/globals.c              |  3 +++
 src/backend/utils/misc/guc_parameters.dat     | 10 ++++++++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/miscadmin.h                       |  1 +
 5 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 25f739a6a17..d4bd412b68d 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -112,19 +112,21 @@ static List *sock_paths = NIL;
 /*
  * Buffers for low-level I/O.
  *
- * The receive buffer is fixed size. Send buffer is usually 8k, but can be
- * enlarged by pq_putmessage_noblock() if the message doesn't fit otherwise.
+ * Both send and receive buffers are dynamically allocated. Send buffer is
+ * usually 8k, but can be enlarged by pq_putmessage_noblock() if the message
+ * doesn't fit otherwise. Receive buffer size is configurable via the
+ * pq_recv_buffer_size GUC parameter.
  */
 
 #define PQ_SEND_BUFFER_SIZE 8192
-#define PQ_RECV_BUFFER_SIZE 8192
 
 static char *PqSendBuffer;
 static int	PqSendBufferSize;	/* Size send buffer */
 static size_t PqSendPointer;	/* Next index to store a byte in PqSendBuffer */
 static size_t PqSendStart;		/* Next index to send a byte in PqSendBuffer */
 
-static char PqRecvBuffer[PQ_RECV_BUFFER_SIZE];
+static char *PqRecvBuffer;		/* Dynamically allocated receive buffer */
+static int	PqRecvBufferSize;	/* Size of receive buffer */
 static int	PqRecvPointer;		/* Next index to read a byte from PqRecvBuffer */
 static int	PqRecvLength;		/* End of data available in PqRecvBuffer */
 
@@ -278,6 +280,8 @@ pq_init(ClientSocket *client_sock)
 	/* initialize state variables */
 	PqSendBufferSize = PQ_SEND_BUFFER_SIZE;
 	PqSendBuffer = MemoryContextAlloc(TopMemoryContext, PqSendBufferSize);
+	PqRecvBufferSize = pq_recv_buffer_size * 1024;	/* GUC is in KB */
+	PqRecvBuffer = MemoryContextAlloc(TopMemoryContext, PqRecvBufferSize);
 	PqSendPointer = PqSendStart = PqRecvPointer = PqRecvLength = 0;
 	PqCommBusy = false;
 	PqCommReadingMsg = false;
@@ -921,7 +925,7 @@ pq_recvbuf(void)
 		errno = 0;
 
 		r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
-						PQ_RECV_BUFFER_SIZE - PqRecvLength);
+						PqRecvBufferSize - PqRecvLength);
 
 		if (r < 0)
 		{
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..5fdf1293d2e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -165,3 +165,6 @@ int			notify_buffers = 16;
 int			serializable_buffers = 32;
 int			subtransaction_buffers = 0;
 int			transaction_buffers = 0;
+
+/* network buffer sizes */
+int			pq_recv_buffer_size = 8;
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..51816d31807 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2259,6 +2259,16 @@
   max => 'INT_MAX / 1000000',
 },
 
+{ name => 'pq_recv_buffer_size', type => 'int', context => 'PGC_BACKEND', group => 'RESOURCES_MEM',
+  short_desc => 'Sets the size of the network receive buffer for each backend.',
+  long_desc => 'Larger values can improve performance when receiving large messages, but consume more memory per connection.',
+  flags => 'GUC_UNIT_KB',
+  variable => 'pq_recv_buffer_size',
+  boot_val => '8',
+  min => '8',
+  max => 'MAX_KILOBYTES',
+},
+
 # Not for general use
 { name => 'pre_auth_delay', type => 'int', context => 'PGC_SIGHUP', group => 'DEVELOPER_OPTIONS',
   short_desc => 'Sets the amount of time to wait before authentication on connection startup.',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..4bb56b4cbb0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -136,6 +136,7 @@
 #huge_page_size = 0                     # zero for system default
                                         # (change requires restart)
 #temp_buffers = 8MB                     # min 800kB
+#pq_recv_buffer_size = 8kB              # min 8kB
 #max_prepared_transactions = 0          # zero disables the feature
                                         # (change requires restart)
 # Caution: it is not advisable to set max_prepared_transactions nonzero unless
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..2e51235ee6a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -186,6 +186,7 @@ extern PGDLLIMPORT int notify_buffers;
 extern PGDLLIMPORT int serializable_buffers;
 extern PGDLLIMPORT int subtransaction_buffers;
 extern PGDLLIMPORT int transaction_buffers;
+extern PGDLLIMPORT int pq_recv_buffer_size;
 
 extern PGDLLIMPORT int MyProcPid;
 extern PGDLLIMPORT pg_time_t MyStartTime;
-- 
2.39.5 (Apple Git-154)

#11 Philipp Marek
philipp@marek.priv.at
In reply to: Filip Janus (#10)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi Filip,

> I am also proposing the introduction of a new GUC
> variable for setting PQ_RECV_BUFFER_SIZE
> in the first patch.

Thanks a lot; this allows configuration for all
connections to a database, and so should fit our needs
while not changing any behaviour in the default case.

> And the second patch removes the dead code.

Also always a good idea ;)

Thank you!

Regards,

Phil

#12 Jakub Wartak
jakub.wartak@enterprisedb.com
In reply to: Filip Janus (#10)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

On Wed, Nov 26, 2025 at 3:03 PM Filip Janus <fjanus@redhat.com> wrote:

> I tried to benchmark the usage of memmove(), but I wasn't able to hit
> the memmove() part of the code. This led me to a deeper investigation,
> and I realized that the memmove() call is probably in a dead part of
> the code.
>
> I am also proposing the introduction of a new GUC variable for setting
> PQ_RECV_BUFFER_SIZE in the first patch. And the second patch removes
> the dead code.

Hi Filip,

Can you please explain how you verified that it gives some perf. increase?

3 tries each, best:

@ pq_recv_buffers = 2MB best of 3:
latency average = 2.594 ms
latency stddev = 0.352 ms
initial connection time = 9.419 ms
tps = 385.431723 (without initial connection time)

@ pq_recv_buffers = default (8kB) best of 3:
latency average = 2.629 ms
latency stddev = 0.929 ms
initial connection time = 9.937 ms
tps = 380.336257 (without initial connection time)

```shell
/usr/pgsql19/bin/pgbench -h xxx -U app -f insert.sql -c 1 -P 1 -n -T 5 -M prepared postgres
```

where insert.sql was generated by:

```shell
echo "CREATE TEMPORARY TABLE IF NOT EXISTS file_storage (data BYTEA STORAGE EXTERNAL) ON COMMIT PRESERVE ROWS;" > insert.sql
echo "INSERT INTO file_storage(data) VALUES ('" >> insert.sql
perl -e 'print "A"x(1024*1024);' >> insert.sql # 1MB
echo "');" >> insert.sql
```

Some description of the env I had:
- tuned TCP rmem/wmem and set congestion control to BBR (to eliminate
  TCP as the bottleneck)
- low RTT (same AWS zone), max 25Gbps total, max 9.6Gbps single-stream TCP
- as TOAST pglz compression was taking most of the CPU in my case, I
  changed it to lz4, which also didn't help a lot, so I changed it to
  avoid *any* compression
- switched to a temporary table to avoid I/O as much as possible,
  wal_level=minimal too
- had to use prepared statements as otherwise I was hitting way too
  much CPU in the parser (yylex routines)

So I'm looking for a way to demonstrate the effect. I've also written
a simple psycopg2-based LO upload benchmark, as pgbench apparently
cannot benchmark this. Sadly, of course, you then cannot (I think)
disable compression and/or load into a TEMPORARY table, so it's far
worse and I/O-heavy (as it hits pg_largeobject*).

-J.

#13 Philipp Marek
philipp@marek.priv.at
In reply to: Jakub Wartak (#12)
Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

Hi Jakub,

> Can you please explain how you verified that it gives some perf. increase?
>
> ...
>
> where insert.sql was generated by:
>
> ```shell
> echo "CREATE TEMPORARY TABLE IF NOT EXISTS file_storage (data BYTEA STORAGE EXTERNAL) ON COMMIT PRESERVE ROWS;" > insert.sql
> echo "INSERT INTO file_storage(data) VALUES ('" >> insert.sql
> perl -e 'print "A"x(1024*1024);' >> insert.sql # 1MB
> echo "');" >> insert.sql
> ```

This data might be too easily compressed.
In production we upload files (PDFs), for benchmarking
we generated random data.

This might make a big difference on the network side
if you're using TLS with compression, for example.
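For example, the generator could be varied to emit incompressible data as a hex-encoded bytea literal (a sketch; the insert_random.sql file name is illustrative, the table matches Jakub's script):

```shell
# Build an INSERT carrying 1MB of random, incompressible data,
# hex-encoded as a bytea literal ('\x...').
{
  echo "CREATE TEMPORARY TABLE IF NOT EXISTS file_storage (data BYTEA STORAGE EXTERNAL) ON COMMIT PRESERVE ROWS;"
  printf "INSERT INTO file_storage(data) VALUES ('\\\\x"
  head -c $((1024 * 1024)) /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo "');"
} > insert_random.sql

wc -c insert_random.sql   # roughly 2MB: two hex digits per data byte
```

Since neither pglz/lz4 nor TLS compression can shrink random bytes, this isolates the network-receive path that the patch targets.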

> Some description of the env I had:
> - tuned TCP rmem/wmem and set congestion control to BBR (to eliminate
>   TCP as the bottleneck)

Well, if your tcp_rmem buffers are bigger than the data you upload,
that might also soften the impact of the patch.

> - low RTT (same AWS zone), max 25Gbps total, max 9.6Gbps single-stream TCP

That might also make a big difference;
we had a few msec between client and server.

> - as TOAST pglz compression was taking most of the CPU in my case, I
>   changed it to lz4, which also didn't help a lot, so I changed it to
>   avoid *any* compression

We had no compression set on the table.

> - switched to a temporary table to avoid I/O as much as possible,
>   wal_level=minimal too

We didn't use a temporary table.

If you think that's helpful,
I can try to dig out what we used to benchmark
the receive buffer size impact.

Regards,

Phil