Small LO_BUFSIZE slows down lo_import and lo_export in libpq

Started by Дмитрий Питаков, about 2 years ago, 3 messages, hackers

dvpitakov@gmail.com

about 2 years ago

In libpq, lo_import calls lo_write once for every 8KB chunk of the file.
Because each lo_write sends a request and then waits for the response,
this slows lo_import down considerably. The chunk size is set by the
LO_BUFSIZE define, which was raised from 1KB to 8KB 24 years ago.
Why not increase the buffer size?
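
To put rough numbers on the concern above, here is a minimal sketch of the arithmetic, assuming one lo_write call (and so one request/response round trip) per buffer-sized chunk, as the message describes. The function names lo_write_calls and min_wait_seconds and the rtt_seconds parameter are hypothetical and exist only for this illustration; they are not part of libpq.

```c
#include <stddef.h>

/*
 * Number of lo_write() calls needed to send file_size bytes when each
 * call carries at most buf_size bytes (ceiling division).
 * Hypothetical helper, not a libpq function.
 */
size_t lo_write_calls(size_t file_size, size_t buf_size)
{
    if (buf_size == 0)
        return 0;               /* not a valid buffer size; avoid dividing by zero */
    return file_size / buf_size + (file_size % buf_size != 0);
}

/*
 * Lower bound on the time spent waiting for responses, assuming every
 * call costs one full round trip of rtt_seconds (an assumed figure).
 */
double min_wait_seconds(size_t file_size, size_t buf_size, double rtt_seconds)
{
    return (double) lo_write_calls(file_size, buf_size) * rtt_seconds;
}
```

With an assumed 1 ms round trip, a 1 GiB file needs 131072 calls at 8KB (about 131 seconds of waiting) and 16384 calls at 64KB (about 16 seconds). This counts only the lo_write calls and ignores everything else, so it is an estimate to be checked by measurement, not a benchmark result.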

Jelte Fennema-Nio

postgres@jeltef.nl

about 2 years ago

In reply to: Дмитрий Питаков (#1)

Re: Small LO_BUFSIZE slows down lo_import and lo_export in libpq

On Fri, 21 Jun 2024 at 10:46, Дмитрий Питаков <dvpitakov@gmail.com> wrote:

> Why not increase the buffer size?

I think changing the buffer size sounds like a reasonable idea, if
that speeds stuff up. But I think it would greatly help your case if
you showed the perf increase using a simple benchmark, especially if
people could run this benchmark on their own machines to reproduce.

Tom Lane

tgl@sss.pgh.pa.us

about 2 years ago

In reply to: Jelte Fennema-Nio (#2)

Re: Small LO_BUFSIZE slows down lo_import and lo_export in libpq

Jelte Fennema-Nio <postgres@jeltef.nl> writes:

> On Fri, 21 Jun 2024 at 10:46, Дмитрий Питаков <dvpitakov@gmail.com> wrote:
>> Why not increase the buffer size?

> I think changing the buffer size sounds like a reasonable idea, if
> that speeds stuff up. But I think it would greatly help your case if
> you showed the perf increase using a simple benchmark, especially if
> people could run this benchmark on their own machines to reproduce.

Yeah. "Why not" is not a patch proposal, mainly because the correct
question is "what other size are you proposing?"

This is not something that we can just randomly whack around, either.
Both lo_import_internal and lo_export assume they can allocate the
buffer on the stack, which means you have to worry about available
stack space. As a concrete example, I believe that musl still
defaults to 128kB thread stack size, which means that a threaded
client program on that platform would definitely fail with
LO_BUFSIZE >= 128kB, and even 64kB would be not without risk.

We could dodge that objection by malloc'ing the buffer, which might
be a good thing to do anyway because it'd improve the odds of getting
a nicely-aligned buffer. But then you have to make the case that the
extra malloc and free isn't a net loss, which it could be for
not-very-large transfers.
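
The malloc'ed-buffer idea can be sketched as follows. This is not the libpq code: copy_in_chunks, chunk_writer and count_bytes are hypothetical names, and the callback stands in for lo_write so the example runs without a server. It only shows the shape of the trade-off: the buffer size becomes a runtime value that does not consume stack space, at the cost of one malloc and one free per transfer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for lo_write(): returns the number of bytes accepted, or < 0 on error. */
typedef int (*chunk_writer)(void *ctx, const char *buf, size_t len);

/*
 * Copy a stream in bufsize-byte chunks using a heap-allocated buffer.
 * Returns the number of writer calls made, or -1 on error.
 */
long copy_in_chunks(FILE *in, size_t bufsize, chunk_writer write_chunk, void *ctx)
{
    char   *buf;
    size_t  n;
    long    calls = 0;

    if (bufsize == 0)
        return -1;
    buf = malloc(bufsize);      /* heap, so bufsize is not limited by stack space */
    if (buf == NULL)
        return -1;

    while ((n = fread(buf, 1, bufsize, in)) > 0)
    {
        int written = write_chunk(ctx, buf, n);

        if (written < 0 || (size_t) written != n)
        {
            free(buf);
            return -1;
        }
        calls++;
    }

    free(buf);
    return ferror(in) ? -1 : calls;
}

/* Sample writer for experiments: adds each chunk length to a size_t total. */
int count_bytes(void *ctx, const char *buf, size_t len)
{
    (void) buf;
    *(size_t *) ctx += len;
    return (int) len;
}
```

Calling copy_in_chunks with different bufsize values on the same input gives the number of calls per size. Whether the extra malloc and free pay off for small transfers is exactly what would have to be measured.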

So bottom line is that you absolutely need a test case whose
performance can be measured under different conditions.

regards, tom lane