Flushing large data immediately in pqcomm

Started by Melih Mutlu over 2 years ago, 53 messages
#1 Melih Mutlu
m.melihmutlu@gmail.com

Hi hackers

I've been looking into ways to reduce the overhead we're having in pqcomm
and I'd like to propose a small patch to modify how socket_putmessage works.

Currently socket_putmessage copies any input data into the pqcomm send
buffer (PqSendBuffer) and the size of this buffer is 8K. When the send
buffer gets full, it's flushed and we continue to copy more data into the
send buffer until we have no data left to be sent.
Since the send buffer is flushed whenever it's full, I think it is safe to
say that if the input data is larger than the buffer size, which is 8K,
the buffer will be flushed at least once (or more, depending on the input
size) to store all the input data.

Proposed change modifies socket_putmessage to send any data larger than
8K immediately without copying it into the send buffer. Assuming that the
send buffer would be flushed anyway due to reaching its limit, the patch
just gets rid of the copy part which seems unnecessary and sends data
without waiting.

This change affects places where pq_putmessage is used such as
pg_basebackup, COPY TO, walsender etc.

I did some experiments to see how the patch performs.
Firstly, I loaded ~5GB of data into a table [1], then ran "COPY test TO
STDOUT". Here are perf results of both the patch and HEAD

HEAD:
-   94,13%     0,22%  postgres  postgres           [.] DoCopyTo
  - 93,90% DoCopyTo
      - 91,80% CopyOneRowTo
         + 47,35% CopyAttributeOutText
         - 26,49% CopySendEndOfRow
            - 25,97% socket_putmessage
               - internal_putbytes
                  - 24,38% internal_flush
                     + secure_write
                  + 1,47% memcpy (inlined)
         + 14,69% FunctionCall1Coll
         + 1,94% appendBinaryStringInfo
         + 0,75% MemoryContextResetOnly
      + 1,54% table_scan_getnextslot (inlined)
Patch:
-   94,40%     0,30%  postgres  postgres           [.] DoCopyTo
   - 94,11% DoCopyTo
      - 92,41% CopyOneRowTo
         + 51,20% CopyAttributeOutText
         - 20,87% CopySendEndOfRow
            - 20,45% socket_putmessage
               - internal_putbytes
                  - 18,50% internal_flush (inlined)
                       internal_flush_buffer
                     + secure_write
                  + 1,61% memcpy (inlined)
         + 17,36% FunctionCall1Coll
         + 1,33% appendBinaryStringInfo
         + 0,93% MemoryContextResetOnly
      + 1,36% table_scan_getnextslot (inlined)

The patch brings a ~5% gain in socket_putmessage.

Also timed the pg_basebackup like:
time pg_basebackup -p 5432 -U replica_user -X none -c fast --no-manifest
-D test

HEAD:
real 0m10,040s
user 0m0,768s
sys 0m7,758s

Patch:
real 0m8,882s
user 0m0,699s
sys 0m6,980s

It seems ~11% faster in this specific case.

I'd appreciate any feedback/thoughts.

[1]:
CREATE TABLE test(id int, name text, time TIMESTAMP);
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100) AS
name, NOW() AS time FROM generate_series(1, 100000000) AS i;

Thanks,
--
Melih Mutlu
Microsoft

Attachments:

0001-Flush-large-data-immediately-in-pqcomm.patch (application/octet-stream, +41/-8)
#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Melih Mutlu (#1)
Re: Flushing large data immediately in pqcomm

On 20/11/2023 14:21, Melih Mutlu wrote:

Hi hackers

I've been looking into ways to reduce the overhead we're having in
pqcomm and I'd like to propose a small patch to modify how
socket_putmessage works.

Currently socket_putmessage copies any input data into the pqcomm send
buffer (PqSendBuffer) and the size of this buffer is 8K. When the send
buffer gets full, it's flushed and we continue to copy more data into
the send buffer until we have no data left to be sent.
Since the send buffer is flushed whenever it's full, I think it is safe
to say that if the input data is larger than the buffer size, which is
8K, the buffer will be flushed at least once (or more, depending on the
input size) to store all the input data.

Agreed, that's silly.

Proposed change modifies socket_putmessage to send any data larger than
8K immediately without copying it into the send buffer. Assuming that
the send buffer would be flushed anyway due to reaching its limit, the
patch just gets rid of the copy part which seems unnecessary and sends
data without waiting.

If there's already some data in PqSendBuffer, I wonder if it would be
better to fill it up with data, flush it, and then send the rest of the
data directly. Instead of flushing the partial data first. I'm afraid
that you'll make a tiny call to secure_write(), followed by a large one,
then a tiny one again, and so forth. Especially when socket_putmessage
itself writes the msgtype and len, which are tiny, before the payload.

Perhaps we should invent a new pq_putmessage() function that would take
an input buffer with 5 bytes of space reserved before the payload.
pq_putmessage() could then fill in the msgtype and len bytes in the
input buffer and send that directly. (Not wedded to that particular API,
but something that would have the same effect)

This change affects places where pq_putmessage is used such as
pg_basebackup, COPY TO, walsender etc.

I did some experiments to see how the patch performs.
Firstly, I loaded ~5GB data into a table [1], then ran "COPY test TO
STDOUT". Here are perf results of both the patch and HEAD > ...
The patch brings a ~5% gain in socket_putmessage.

[1]
CREATE TABLE test(id int, name text, time TIMESTAMP);
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
AS name, NOW() AS time FROM generate_series(1, 100000000) AS i;

I'm surprised by these results, because each row in that table is < 600
bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
that test. Am I missing something?

--
Heikki Linnakangas
Neon (https://neon.tech)

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: Flushing large data immediately in pqcomm

On Mon, Jan 29, 2024 at 11:12 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Agreed, that's silly.

+1.

If there's already some data in PqSendBuffer, I wonder if it would be
better to fill it up with data, flush it, and then send the rest of the
data directly. Instead of flushing the partial data first. I'm afraid
that you'll make a tiny call to secure_write(), followed by a large one,
then a tiny one again, and so forth. Especially when socket_putmessage
itself writes the msgtype and len, which are tiny, before the payload.

Perhaps we should invent a new pq_putmessage() function that would take
an input buffer with 5 bytes of space reserved before the payload.
pq_putmessage() could then fill in the msgtype and len bytes in the
input buffer and send that directly. (Not wedded to that particular API,
but something that would have the same effect)

I share the concern; I'm not sure about the best solution. I wonder if
it would be useful to have pq_putmessagev() in the style of writev()
et al. Or maybe what we need is secure_writev().

I also wonder if the threshold for sending data directly should be
smaller than the buffer size, and/or whether it should depend on the
buffer being empty. If we have an 8kB buffer that currently has
nothing in it, and somebody writes 2kB, I suspect it might be wrong to
copy that into the buffer. If the same buffer had 5kB used and 3kB
free, copying sounds a lot more likely to work out. The goal here is
probably to condense sequences of short messages into a single
transmission while sending long messages individually. I'm just not
quite sure what heuristic would do that most effectively.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4 Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: Flushing large data immediately in pqcomm

Hi Heikki,

Heikki Linnakangas <hlinnaka@iki.fi>, 29 Oca 2024 Pzt, 19:12 tarihinde şunu
yazdı:

Proposed change modifies socket_putmessage to send any data larger than
8K immediately without copying it into the send buffer. Assuming that
the send buffer would be flushed anyway due to reaching its limit, the
patch just gets rid of the copy part which seems unnecessary and sends
data without waiting.

If there's already some data in PqSendBuffer, I wonder if it would be
better to fill it up with data, flush it, and then send the rest of the
data directly. Instead of flushing the partial data first. I'm afraid
that you'll make a tiny call to secure_write(), followed by a large one,
then a tiny one again, and so forth. Especially when socket_putmessage
itself writes the msgtype and len, which are tiny, before the payload.

I agree that I could do better there without flushing twice for both
PqSendBuffer and input data. PqSendBuffer always has some data, even if
it's tiny, since msgtype and len are added.

Perhaps we should invent a new pq_putmessage() function that would take
an input buffer with 5 bytes of space reserved before the payload.
pq_putmessage() could then fill in the msgtype and len bytes in the
input buffer and send that directly. (Not wedded to that particular API,
but something that would have the same effect)

I thought about doing this. The reason why I didn't was because I think
that such a change would require adjusting all input buffers wherever
pq_putmessage is called, and I did not want to touch that many different
places. These places where we need pq_putmessage might not be that many
though, I'm not sure.

This change affects places where pq_putmessage is used such as
pg_basebackup, COPY TO, walsender etc.

I did some experiments to see how the patch performs.
Firstly, I loaded ~5GB data into a table [1], then ran "COPY test TO
STDOUT". Here are perf results of both the patch and HEAD > ...
The patch brings a ~5% gain in socket_putmessage.

[1]
CREATE TABLE test(id int, name text, time TIMESTAMP);
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
AS name, NOW() AS time FROM generate_series(1, 100000000) AS i;

I'm surprised by these results, because each row in that table is < 600
bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
that test. Am I missing something?

You're absolutely right. I made a silly mistake there. I also think that
the way I did perf analysis does not make much sense, even if one row of
the table is greater than 8kB.
Here are some quick timing results after being sure that it triggers this
patch's optimization. I need to think more on how to profile this with
perf. I hope to share proper results soon.

I just added a bit more zeros [1] and ran [2] (hopefully measured the
correct thing)

HEAD:
real 2m48,938s
user 0m9,226s
sys 1m35,342s

Patch:
real 2m40,690s
user 0m8,492s
sys 1m31,001s

[1]:
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 10000)
AS name, NOW() AS time FROM generate_series(1, 1000000) AS i;

[2]:
rm /tmp/dummy && echo 3 | sudo tee /proc/sys/vm/drop_caches && time psql
-d postgres -c "COPY test TO STDOUT;" > /tmp/dummy

Thanks,
--
Melih Mutlu
Microsoft

#5 Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Robert Haas (#3)
Re: Flushing large data immediately in pqcomm

Hi Robert,

Robert Haas <robertmhaas@gmail.com>, 29 Oca 2024 Pzt, 20:48 tarihinde şunu
yazdı:

If there's already some data in PqSendBuffer, I wonder if it would be
better to fill it up with data, flush it, and then send the rest of the
data directly. Instead of flushing the partial data first. I'm afraid
that you'll make a tiny call to secure_write(), followed by a large one,
then a tiny one again, and so forth. Especially when socket_putmessage
itself writes the msgtype and len, which are tiny, before the payload.

Perhaps we should invent a new pq_putmessage() function that would take
an input buffer with 5 bytes of space reserved before the payload.
pq_putmessage() could then fill in the msgtype and len bytes in the
input buffer and send that directly. (Not wedded to that particular API,
but something that would have the same effect)

I share the concern; I'm not sure about the best solution. I wonder if
it would be useful to have pq_putmessagev() in the style of writev()
et al. Or maybe what we need is secure_writev().

I thought about using writev() for not only pq_putmessage() but
pq_putmessage_noblock() too. Currently, pq_putmessage_noblock()
repallocs PqSendBuffer and copies the input buffer, which can easily be
larger than 8kB, into PqSendBuffer. I also discussed it with Thomas
off-list. The thing is that I believe we would need secure_writev() with
SSL/GSS cases handled properly. I'm just not sure if the effort would be
worthwhile considering what we gain from it.

I also wonder if the threshold for sending data directly should be
smaller than the buffer size, and/or whether it should depend on the
buffer being empty.

You might be right. I'm not sure what the ideal threshold would be.

If we have an 8kB buffer that currently has
nothing in it, and somebody writes 2kB, I suspect it might be wrong to
copy that into the buffer. If the same buffer had 5kB used and 3kB
free, copying sounds a lot more likely to work out. The goal here is
probably to condense sequences of short messages into a single
transmission while sending long messages individually. I'm just not
quite sure what heuristic would do that most effectively.

Sounds like it's difficult to come up with a heuristic that would work well
enough for most cases.
One issue with sending data directly instead of copying it whenever the
buffer is empty is that the buffer starts out empty. I believe it would
then stay empty forever, since we would never copy anything into it. We
could maybe simply set the threshold to half the buffer size (4kB) and
hope that works better, or copy the data only if it fits into the
remaining space in the buffer. What do you think?

An additional note: since I mentioned pq_putmessage_noblock(), I've been
testing sending input data immediately in pq_putmessage_noblock() without
blocking, and copying the data into PqSendBuffer only if the socket would
block and cannot send it. Unfortunately, I don't have strong numbers to
demonstrate any improvement in perf or timing yet. But I'd still like to
know what you think about it.

Thanks,
--
Melih Mutlu
Microsoft

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Melih Mutlu (#5)
Re: Flushing large data immediately in pqcomm

On Tue, Jan 30, 2024 at 12:58 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:

Sounds like it's difficult to come up with a heuristic that would work well enough for most cases.
One thing with sending data instead of copying it if the buffer is empty is that initially the buffer is empty. I believe it will stay empty forever if we do not copy anything when the buffer is empty. We can maybe simply set the threshold to the buffer size/2 (4kB) and hope that will work better. Or copy the data only if it fits into the remaining space in the buffer. What do you think?

An additional note while I mentioned pq_putmessage_noblock(), I've been testing sending input data immediately in pq_putmessage_noblock() without blocking and copy the data into PqSendBuffer only if the socket would block and cannot send it. Unfortunately, I don't have strong numbers to demonstrate any improvement in perf or timing yet. But I still like to know what would you think about it?

I think this is an area where it's very difficult to foresee on
theoretical grounds what will be right in practice. The problem is
that the best algorithm probably depends on what usage patterns are
common in practice. I think one common usage pattern will be a bunch
of roughly equal-sized messages in a row, like CopyData or DataRow
messages -- but those messages won't have a consistent width. It would
probably be worth testing what behavior you see in such cases -- start
with say a stream of 100 byte messages and then gradually increase and
see how the behavior evolves.

But you can also have other patterns, with messages of different sizes
interleaved. In the case of FE-to-BE traffic, the extended query
protocol might be a good example of that: the Parse message could be
quite long, or not, but the Bind Describe Execute Sync messages that
follow are probably all short. That case doesn't arise in this
direction, but I can't think exactly of which cases do. It seems
like someone would need to play around and try some different cases
and maybe log the sizes of the secure_write() calls with various
algorithms, and then try to figure out what's best. For example, if
the alternating short-write, long-write behavior that Heikki mentioned
is happening, and I do think that particular thing is a very real
risk, then you haven't got it figured out yet...

--
Robert Haas
EDB: http://www.enterprisedb.com

#7 Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#6)
Re: Flushing large data immediately in pqcomm

On Tue, 30 Jan 2024 at 19:48, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Jan 30, 2024 at 12:58 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:

Sounds like it's difficult to come up with a heuristic that would work well enough for most cases.
One thing with sending data instead of copying it if the buffer is empty is that initially the buffer is empty. I believe it will stay empty forever if we do not copy anything when the buffer is empty. We can maybe simply set the threshold to the buffer size/2 (4kB) and hope that will work better. Or copy the data only if it fits into the remaining space in the buffer. What do you think?

An additional note while I mentioned pq_putmessage_noblock(), I've been testing sending input data immediately in pq_putmessage_noblock() without blocking and copy the data into PqSendBuffer only if the socket would block and cannot send it. Unfortunately, I don't have strong numbers to demonstrate any improvement in perf or timing yet. But I still like to know what would you think about it?

I think this is an area where it's very difficult to foresee on
theoretical grounds what will be right in practice

I agree that it's hard to prove that such heuristics will always be
better in practice than the status quo. But I feel like we shouldn't
let perfect be the enemy of good here. One approach that is a clear
improvement over the status quo is:
1. If the buffer is empty AND the data we are trying to send is larger
than the buffer size, then don't use the buffer.
2. If not, fill up the buffer first (just like we do now) then send
that. And if the left over data is then still larger than the buffer,
then now the buffer is empty so 1. applies.

#8 Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema-Nio (#7)
Re: Flushing large data immediately in pqcomm

On Tue, Jan 30, 2024 at 6:39 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

I agree that it's hard to prove that such heuristics will always be
better in practice than the status quo. But I feel like we shouldn't
let perfect be the enemy of good here.

Sure, I agree.

One approach that is a clear
improvement over the status quo is:
1. If the buffer is empty AND the data we are trying to send is larger
than the buffer size, then don't use the buffer.
2. If not, fill up the buffer first (just like we do now) then send
that. And if the left over data is then still larger than the buffer,
then now the buffer is empty so 1. applies.

That seems like it might be a useful refinement of Melih Mutlu's
original proposal, but consider a message stream that consists of
messages exactly 8kB in size. If that message stream begins when the
buffer is empty, all messages are sent directly. If it begins when
there are any number of bytes in the buffer, we buffer every message
forever. That's kind of an odd artifact, but maybe it's fine in
practice. I say again that it's good to test out a bunch of scenarios
and see what shakes out.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9 Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#8)
Re: Flushing large data immediately in pqcomm

On Wed, 31 Jan 2024 at 18:23, Robert Haas <robertmhaas@gmail.com> wrote:

That's kind of an odd artifact, but maybe it's fine in
practice.

I agree it's an odd artifact, but it's not a regression over the
status quo. Achieving that was the intent of my suggestion: A change
that improves some cases, but regresses nowhere.

I say again that it's good to test out a bunch of scenarios
and see what shakes out.

Testing a bunch of scenarios to find a good one sounds like a good
idea, which can probably give us a more optimal heuristic. But it also
sounds like a lot of work, and probably results in a lot of
discussion. That extra effort might mean that we're not going to
commit any change for PG17 (or even at all). If so, then I'd rather
have a modest improvement from my refinement of Melih's proposal, than
none at all.

#10 Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema-Nio (#9)
Re: Flushing large data immediately in pqcomm

On Wed, Jan 31, 2024 at 12:49 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

Testing a bunch of scenarios to find a good one sounds like a good
idea, which can probably give us a more optimal heuristic. But it also
sounds like a lot of work, and probably results in a lot of
discussion. That extra effort might mean that we're not going to
commit any change for PG17 (or even at all). If so, then I'd rather
have a modest improvement from my refinement of Melih's proposal, than
none at all.

Personally, I don't think it's likely that anything will get committed
here without someone doing more legwork than I've seen on the thread
so far. I don't have any plan to pick up this patch anyway, but if I
were thinking about it, I would abandon the idea unless I were
prepared to go test a bunch of stuff myself. I agree with the core
idea of this work, but not with the idea that the bar is as low as "if
it can't lose relative to today, it's good enough."

Of course, another committer may see it differently.

--
Robert Haas
EDB: http://www.enterprisedb.com

#11 Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Robert Haas (#8)
Re: Flushing large data immediately in pqcomm

Robert Haas <robertmhaas@gmail.com>, 31 Oca 2024 Çar, 20:23 tarihinde şunu
yazdı:

On Tue, Jan 30, 2024 at 6:39 PM Jelte Fennema-Nio <postgres@jeltef.nl>
wrote:

I agree that it's hard to prove that such heuristics will always be
better in practice than the status quo. But I feel like we shouldn't
let perfect be the enemy of good here.

Sure, I agree.

One approach that is a clear
improvement over the status quo is:
1. If the buffer is empty AND the data we are trying to send is larger
than the buffer size, then don't use the buffer.
2. If not, fill up the buffer first (just like we do now) then send
that. And if the left over data is then still larger than the buffer,
then now the buffer is empty so 1. applies.

That seems like it might be a useful refinement of Melih Mutlu's
original proposal, but consider a message stream that consists of
messages exactly 8kB in size. If that message stream begins when the
buffer is empty, all messages are sent directly. If it begins when
there are any number of bytes in the buffer, we buffer every message
forever. That's kind of an odd artifact, but maybe it's fine in
practice. I say again that it's good to test out a bunch of scenarios
and see what shakes out.

Isn't this already the case? Imagine sending exactly 8kB messages: the
first pq_putmessage() call will buffer 8kB. Any call after this point
simply sends an 8kB message already buffered from the previous call and
buffers a new 8kB message. The only difference here is we keep the message
in the buffer for a while instead of sending it directly. In theory, the
proposed idea should not bring any difference in the number of flushes and
the size of data we send each time, but can remove unnecessary copies to
the buffer in this case. I guess the behaviour is also the same with or
without the patch in case the buffer already has some bytes.

Robert Haas <robertmhaas@gmail.com>, 31 Oca 2024 Çar, 21:28 tarihinde şunu
yazdı:

Personally, I don't think it's likely that anything will get committed
here without someone doing more legwork than I've seen on the thread
so far. I don't have any plan to pick up this patch anyway, but if I
were thinking about it, I would abandon the idea unless I were
prepared to go test a bunch of stuff myself. I agree with the core
idea of this work, but not with the idea that the bar is as low as "if
it can't lose relative to today, it's good enough."

You're right and I'm open to doing more legwork. I'd also appreciate any
suggestion about how to test this properly and/or useful scenarios to test.
That would be really helpful.

I understand that I should provide more/better analysis around this change
to prove that it doesn't hurt (hopefully) but improves some cases, even if
not all of them. That may even help us find a better approach than what's
already proposed. Just to clarify, I don't think anyone here suggests that
the bar should be at "if it can't lose relative to today, it's good
enough". IMHO "a change that improves some cases, but regresses nowhere"
does not translate to that.

Thanks,
--
Melih Mutlu
Microsoft

#12 Robert Haas
robertmhaas@gmail.com
In reply to: Melih Mutlu (#11)
Re: Flushing large data immediately in pqcomm

On Wed, Jan 31, 2024 at 2:23 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:

That seems like it might be a useful refinement of Melih Mutlu's
original proposal, but consider a message stream that consists of
messages exactly 8kB in size. If that message stream begins when the
buffer is empty, all messages are sent directly. If it begins when
there are any number of bytes in the buffer, we buffer every message
forever. That's kind of an odd artifact, but maybe it's fine in
practice. I say again that it's good to test out a bunch of scenarios
and see what shakes out.

Isn't this already the case? Imagine sending exactly 8kB messages: the first pq_putmessage() call will buffer 8kB. Any call after this point simply sends an 8kB message already buffered from the previous call and buffers a new 8kB message. The only difference here is we keep the message in the buffer for a while instead of sending it directly. In theory, the proposed idea should not bring any difference in the number of flushes and the size of data we send each time, but can remove unnecessary copies to the buffer in this case. I guess the behaviour is also the same with or without the patch in case the buffer already has some bytes.

Yes, it's never worse than today in terms of number of buffer flushes,
but it doesn't feel like great behavior, either. Users tend not to
like it when the behavior of an algorithm depends heavily on
incidental factors that shouldn't really be relevant, like whether the
buffer starts with 1 byte in it or 0 at the beginning of a long
sequence of messages. They see the performance varying "for no reason"
and they dislike it. They don't say "even the bad performance is no
worse than earlier versions so it's fine."

You're right and I'm open to doing more legwork. I'd also appreciate any suggestion about how to test this properly and/or useful scenarios to test. That would be really helpful.

I think experimenting to see whether the long-short-long-short
behavior that Heikki postulated emerges in practice would be a really
good start.

Another experiment that I think would be interesting is: suppose you
create a patch that sends EVERY message without buffering and compare
that to master. My naive expectation would be that this will lose if
you pump short messages through that connection and win if you pump
long messages through that connection. Is that true? If yes, at what
point do we break even on performance? Does it depend on whether the
connection is local or over a network? Does it depend on whether it's
with or without SSL? Does it depend on Linux vs. Windows vs.
whateverBSD? What happens if you twiddle the 8kB buffer size up or,
say, down to just below the Ethernet frame size?

I think that what we really want to understand here is under what
circumstances the extra layer of buffering is a win vs. being a loss.
If all the stuff I just mentioned doesn't really matter and the answer
is, say, that an 8kB buffer is great and the breakpoint where extra
buffering makes sense is also 8kB, and that's consistent regardless of
other variables, then your algorithm or Jelte's variant or something
of that nature is probably just right. But if it turns out, say, that
the extra buffering is only a win for sub-1kB messages, that would be
rather nice to know before we finalize the approach. Also, if it turns
out that the answer differs dramatically based on whether you're using
a UNIX socket or TCP, that would also be nice to know before
finalizing an algorithm.

I understand that I should provide more/better analysis around this change to prove that it doesn't hurt (hopefully) but improves some cases even though not all the cases. That may even help us to find a better approach than what's already proposed. Just to clarify, I don't think anyone here suggests that the bar should be at "if it can't lose relative to today, it's good enough". IMHO "a change that improves some cases, but regresses nowhere" does not translate to that.

Well, I thought those were fairly similar sentiments, so maybe I'm not
quite understanding the statement in the way it was meant.

--
Robert Haas
EDB: http://www.enterprisedb.com

#13 Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#12)
Re: Flushing large data immediately in pqcomm

Hi,

On 2024-01-31 14:57:35 -0500, Robert Haas wrote:

You're right and I'm open to doing more legwork. I'd also appreciate any
suggestion about how to test this properly and/or useful scenarios to
test. That would be really helpful.

I think experimenting to see whether the long-short-long-short
behavior that Heikki postulated emerges in practice would be a really
good start.

Another experiment that I think would be interesting is: suppose you
create a patch that sends EVERY message without buffering and compare
that to master. My naive expectation would be that this will lose if
you pump short messages through that connection and win if you pump
long messages through that connection. Is that true? If yes, at what
point do we break even on performance? Does it depend on whether the
connection is local or over a network? Does it depend on whether it's
with or without SSL? Does it depend on Linux vs. Windows vs.
whateverBSD? What happens if you twiddle the 8kB buffer size up or,
say, down to just below the Ethernet frame size?

I feel like you're putting up too high a bar for something that can be a
pretty clear improvement on its own, without a downside. The current behaviour
is pretty absurd, doing all this research across all platforms isn't going to
disprove that - and it's a lot of work. ISTM we can analyze this easily
enough without taking concrete hardware into account.

One thing that I haven't seen mentioned here that's relevant around using
small buffers: Postgres uses TCP_NODELAY, and has to do so. That means doing
tiny sends can hurt substantially.

I think that what we really want to understand here is under what
circumstances the extra layer of buffering is a win vs. being a loss.

It's quite easy to see that doing no buffering isn't viable - we end up with
tiny tiny TCP packets, one for each send(). And then there's the syscall
overhead.

Here's a quickly thrown together benchmark using netperf. First with -D, which
instructs it to use TCP_NODELAY, as we do.

10gbit network, remote host:

$ (fields="request_size,throughput"; echo "$fields";for i in $(seq 0 16); do s=$((2**$i));netperf -P0 -t TCP_STREAM -l1 -H alap5-10gbe -- -r $s,$s -D 1 -o "$fields";done)|column -t -s,

request_size  throughput
1             22.73
2             45.77
4             108.64
8             225.78
16            560.32
32            1035.61
64            2177.91
128           3604.71
256           5878.93
512           9334.70
1024          9031.13
2048          9405.35
4096          9334.60
8192          9275.33
16384         9406.29
32768         9385.52
65536         9399.40

localhost:
request_size  throughput
1             2.76
2             5.10
4             9.89
8             20.51
16            43.42
32            87.13
64            173.72
128           343.70
256           647.89
512           1328.79
1024          2550.14
2048          4998.06
4096          9482.06
8192          17130.76
16384         29048.02
32768         42106.33
65536         48579.95

I'm slightly baffled by the poor performance of localhost with tiny packet
sizes. Ah, I see - it's the NODELAY. Without that:

localhost:
request_size  throughput
1             32.02
2             60.58
4             114.32
8             262.71
16            558.42
32            1053.66
64            2099.39
128           3815.60
256           6566.19
512           11751.79
1024          18976.11
2048          27222.99
4096          33838.07
8192          38219.60
16384         39146.37
32768         44784.98
65536         44214.70

NODELAY triggers many more context switches, because there's immediately data
available for the receiving side. Whereas with real network the interrupts get
coalesced.

I think that's pretty clear evidence that we need buffering. But I think we
can probably be smarter than we are right now, and than what's been proposed
in the patch. Because of TCP_NODELAY we shouldn't send a tiny buffer on its
own; it may trigger sending a small TCP packet, which is quite inefficient.

While not perfect - e.g. because networks might use jumbo packets / large MTUs
and we don't know how many outstanding bytes there are locally, I think a
decent heuristic could be to always try to send at least one packet worth of
data at once (something like ~1400 bytes), even if that requires copying some
of the input data. It might not be sent on its own, but it should make it
reasonably unlikely to end up with tiny tiny packets.

Greetings,

Andres Freund

#14Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#13)
Re: Flushing large data immediately in pqcomm

On Wed, Jan 31, 2024 at 10:24 PM Andres Freund <andres@anarazel.de> wrote:

While not perfect - e.g. because networks might use jumbo packets / large MTUs
and we don't know how many outstanding bytes there are locally, I think a
decent heuristic could be to always try to send at least one packet worth of
data at once (something like ~1400 bytes), even if that requires copying some
of the input data. It might not be sent on its own, but it should make it
reasonably unlikely to end up with tiny tiny packets.

I think that COULD be a decent heuristic but I think it should be
TESTED, including against the ~3 or so other heuristics proposed on
this thread, before we make a decision.

I literally mentioned the Ethernet frame size as one of the things
that we should test whether it's relevant in the exact email to which
you're replying, and you replied by proposing that as a heuristic, but
also criticizing me for wanting more research before we settle on
something. Are we just supposed to assume that your heuristic is
better than the others proposed here without testing anything, or,
like, what? I don't think this needs to be a completely exhaustive or
exhausting process, but I think trying a few different things out and
seeing what happens is smart.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#14)
Re: Flushing large data immediately in pqcomm

On Thu, Feb 1, 2024 at 10:52 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jan 31, 2024 at 10:24 PM Andres Freund <andres@anarazel.de> wrote:

While not perfect - e.g. because networks might use jumbo packets / large MTUs
and we don't know how many outstanding bytes there are locally, I think a
decent heuristic could be to always try to send at least one packet worth of
data at once (something like ~1400 bytes), even if that requires copying some
of the input data. It might not be sent on its own, but it should make it
reasonably unlikely to end up with tiny tiny packets.

I think that COULD be a decent heuristic but I think it should be
TESTED, including against the ~3 or so other heuristics proposed on
this thread, before we make a decision.

I literally mentioned the Ethernet frame size as one of the things
that we should test whether it's relevant in the exact email to which
you're replying, and you replied by proposing that as a heuristic, but
also criticizing me for wanting more research before we settle on
something. Are we just supposed to assume that your heuristic is
better than the others proposed here without testing anything, or,
like, what? I don't think this needs to be a completely exhaustive or
exhausting process, but I think trying a few different things out and
seeing what happens is smart.

There was probably a better way to phrase this email ... the sentiment
is sincere, but there was almost certainly a way of writing it that
didn't sound like I'm super-annoyed.

Apologies for that.

--
Robert Haas
EDB: http://www.enterprisedb.com

#16Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#15)
Re: Flushing large data immediately in pqcomm

Hi,

On 2024-02-01 15:02:57 -0500, Robert Haas wrote:

On Thu, Feb 1, 2024 at 10:52 AM Robert Haas <robertmhaas@gmail.com> wrote:
There was probably a better way to phrase this email ... the sentiment
is sincere, but there was almost certainly a way of writing it that
didn't sound like I'm super-annoyed.

NP - I could have phrased mine better as well...

On Wed, Jan 31, 2024 at 10:24 PM Andres Freund <andres@anarazel.de> wrote:

While not perfect - e.g. because networks might use jumbo packets / large MTUs
and we don't know how many outstanding bytes there are locally, I think a
decent heuristic could be to always try to send at least one packet worth of
data at once (something like ~1400 bytes), even if that requires copying some
of the input data. It might not be sent on its own, but it should make it
reasonably unlikely to end up with tiny tiny packets.

I think that COULD be a decent heuristic but I think it should be
TESTED, including against the ~3 or so other heuristics proposed on
this thread, before we make a decision.

I literally mentioned the Ethernet frame size as one of the things
that we should test whether it's relevant in the exact email to which
you're replying, and you replied by proposing that as a heuristic, but
also criticizing me for wanting more research before we settle on
something.

I mentioned the frame size thing because afaict nobody in the thread had
mentioned our use of TCP_NODELAY (which basically forces the kernel to send
out data immediately instead of waiting for further data to be sent). Without
that it'd be a lot less problematic to occasionally send data in small
increments in between larger sends. Nor would packet sizes be as relevant.

Are we just supposed to assume that your heuristic is better than the
others proposed here without testing anything, or, like, what? I don't
think this needs to be a completely exhaustive or exhausting process, but
I think trying a few different things out and seeing what happens is
smart.

I wasn't trying to say that my heuristic necessarily is better. What I was
trying to get at - and expressed badly - was that I doubt that testing can get
us all that far here. It's not too hard to test the effects of our buffering
with regards to syscall overhead, but once you actually take network effects
into account it gets quite hard. Bandwidth, latency, the specific network
hardware and operating systems involved all play a significant role. Given
how, uh, naive our current approach is, I think analyzing the situation from
first principles and then doing some basic validation of the results makes
more sense.

Separately, I think we shouldn't aim for perfect here. It's obviously
extremely inefficient to send a larger amount of data by memcpy()ing and
send()ing it in 8kB chunks. As mentioned by several folks upthread, we can
improve upon that without having worse behaviour than today. Medium-long term
I suspect we're going to want to use asynchronous network interfaces, in
combination with zero-copy sending, which requires larger changes. Not that
relevant for things like query results, quite relevant for base backups etc.

It's perhaps also worth mentioning that the small send buffer isn't great for
SSL performance, the encryption overhead increases when sending in small
chunks.

I hacked up Melih's patch to send the pending data together with the first bit
of the large "to be sent" data and also added a patch to increased
SINK_BUFFER_LENGTH by 16x. With a 12GB database I tested the time for
pg_basebackup -c fast -Ft --compress=none -Xnone -D - -d "$conn" > /dev/null

time via
test                  unix    tcp     tcp+ssl
master                6.305s  9.436s  15.596s
master-larger-buffer  6.535s  9.453s  15.208s
patch                 5.900s  7.465s  13.634s
patch-larger-buffer   5.233s  5.439s  11.730s

The increase when using tcp is pretty darn impressive. If I had remembered in
time to disable manifest checksums, the win would have been even bigger.

The bottleneck for SSL is that it still ends up with ~16kB sends, not sure
why.

Greetings,

Andres Freund

#17Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Andres Freund (#16)
Re: Flushing large data immediately in pqcomm

Hi hackers,

I did some experiments with this patch, after previous discussions. This
probably does not answer all questions, but would be happy to do more if
needed.

First, I updated the patch according to what was suggested here [1]. PSA v2.
I tweaked the master branch a bit to not allow any buffering. I compared
HEAD, this patch and no buffering at all.
I also added a simple GUC to control PqSendBufferSize, this change only
allows to modify the buffer size and should not have any impact on
performance.

I again ran the COPY TO STDOUT command and timed it. AFAIU COPY sends data
row by row, and I tried running the command under different scenarios with
different # of rows and row sizes. You can find the test script attached
(see test.sh).
All timings are in ms.

1- row size = 100 bytes, # of rows = 1000000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│ │ 1400 bytes │ 2KB │ 4KB │ 8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD │ 1036 │ 998 │ 940 │ 910 │ 894 │ 874 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch │ 1107 │ 1032 │ 980 │ 957 │ 917 │ 909 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 6230 │ 6125 │ 6282 │ 6279 │ 6255 │ 6221 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘

2- row size = half of the rows are 1KB and the rest is 10KB, # of rows =
1000000
┌───────────┬────────────┬───────┬───────┬───────┬───────┬───────┐
│ │ 1400 bytes │ 2KB │ 4KB │ 8KB │ 16KB │ 32KB │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ HEAD │ 25197 │ 23414 │ 20612 │ 19206 │ 18334 │ 18033 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ patch │ 19843 │ 19889 │ 19798 │ 19129 │ 18578 │ 18260 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ no buffer │ 23752 │ 23565 │ 23602 │ 23622 │ 23541 │ 23599 │
└───────────┴────────────┴───────┴───────┴───────┴───────┴───────┘

3- row size = half of the rows are 1KB and the rest is 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│ │ 1400 bytes │ 2KB │ 4KB │ 8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD │ 3137 │ 2937 │ 2687 │ 2551 │ 2456 │ 2465 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch │ 2399 │ 2390 │ 2402 │ 2415 │ 2417 │ 2422 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 2417 │ 2414 │ 2429 │ 2418 │ 2435 │ 2404 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘

4- row size = all rows are 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│ │ 1400 bytes │ 2KB │ 4KB │ 8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD │ 6113 │ 5764 │ 5281 │ 5009 │ 4885 │ 4872 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch │ 4759 │ 4754 │ 4754 │ 4758 │ 4782 │ 4805 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 4756 │ 4774 │ 4793 │ 4766 │ 4770 │ 4774 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘

Some quick observations:
1- Even though I expect both the patch and HEAD to behave similarly in case
of small data (case 1: 100 bytes), the patch runs slightly slower than HEAD.
2- In cases where the data does not fit into the buffer, the patch starts
performing better than HEAD. For example, in case 2, the patch seems faster
until the buffer size exceeds the data length. When the buffer size is set
to something larger than 10KB (16KB/32KB in this case), there is again a
slight performance loss with the patch, as in case 1.
3- With large row sizes (i.e. sizes that do not fit into the buffer), not
buffering at all starts performing better than HEAD. Similarly, the patch
performs better too, as it stops buffering if data does not fit into the
buffer.

[1]: /messages/by-id/CAGECzQTYUhnC1bO=zNiSpUgCs=hCYxVHvLD2doXNx3My6ZAC2w@mail.gmail.com

Thanks,
--
Melih Mutlu
Microsoft

Attachments:

add-pq_send_buffer_size-GUC.patch (application/x-patch, +13 -2)
test.sh (text/x-sh)
v2-0001-Flush-large-data-immediately-in-pqcomm.patch (application/octet-stream, +46 -14)
#18Robert Haas
robertmhaas@gmail.com
In reply to: Melih Mutlu (#17)
Re: Flushing large data immediately in pqcomm

On Thu, Mar 14, 2024 at 7:22 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:

1- Even though I expect both the patch and HEAD to behave similarly in case of small data (case 1: 100 bytes), the patch runs slightly slower than HEAD.

I wonder why this happens. It seems like maybe something that could be fixed.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Melih Mutlu (#17)
Re: Flushing large data immediately in pqcomm

On Thu, 14 Mar 2024 at 12:22, Melih Mutlu <m.melihmutlu@gmail.com> wrote:

I did some experiments with this patch, after previous discussions

One thing I noticed is that the buffer sizes don't seem to matter much
in your experiments, even though Andres's expectation was that 1400
would be better. I think I know the reason for that:

afaict from your test.sh script you connect psql over localhost or
maybe even a unix socket to postgres. Neither of those would have an
MTU of 1500. You'd probably want to run those tests over an actual
network, or at least change the MTU of the loopback interface. e.g. my
"lo" interface's MTU is 65536 by default:

❯ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever

#20Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Melih Mutlu (#17)
Re: Flushing large data immediately in pqcomm

On 14/03/2024 13:22, Melih Mutlu wrote:

@@ -1282,14 +1283,32 @@ internal_putbytes(const char *s, size_t len)
if (internal_flush())
return EOF;
}
-		amount = PqSendBufferSize - PqSendPointer;
-		if (amount > len)
-			amount = len;
-		memcpy(PqSendBuffer + PqSendPointer, s, amount);
-		PqSendPointer += amount;
-		s += amount;
-		len -= amount;
+
+		/*
+		 * If the buffer is empty and data length is larger than the buffer
+		 * size, send it without buffering. Otherwise, put as much data as
+		 * possible into the buffer.
+		 */
+		if (!pq_is_send_pending() && len >= PqSendBufferSize)
+		{
+			int start = 0;
+
+			socket_set_nonblocking(false);
+			if (internal_flush_buffer(s, &start, (int *)&len))
+				return EOF;
+		}
+		else
+		{
+			amount = PqSendBufferSize - PqSendPointer;
+			if (amount > len)
+				amount = len;
+			memcpy(PqSendBuffer + PqSendPointer, s, amount);
+			PqSendPointer += amount;
+			s += amount;
+			len -= amount;
+		}
}
+
return 0;
}

Two small bugs:

- the "(int *) &len" cast is not ok, and will break visibly on
big-endian systems where sizeof(int) != sizeof(size_t).

- If internal_flush_buffer() cannot write all the data in one call, it
updates 'start' for how much it wrote, and leaves 'end' unchanged. You
throw the updated 'start' value away, and will send the same data again
on next iteration.

Not a correctness issue, but instead of pq_is_send_pending(), I think it
would be better to check "PqSendStart == PqSendPointer" directly, or
call socket_is_send_pending() directly here. pq_is_send_pending() does
the same, but it's at a higher level of abstraction.

--
Heikki Linnakangas
Neon (https://neon.tech)

#21Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#18)
#22David Rowley
dgrowleyml@gmail.com
In reply to: Jelte Fennema-Nio (#21)
#23David Rowley
dgrowleyml@gmail.com
In reply to: Heikki Linnakangas (#20)
#24Melih Mutlu
m.melihmutlu@gmail.com
In reply to: David Rowley (#22)
#25David Rowley
dgrowleyml@gmail.com
In reply to: Melih Mutlu (#24)
#26Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: David Rowley (#25)
#27Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Melih Mutlu (#24)
#28David Rowley
dgrowleyml@gmail.com
In reply to: Jelte Fennema-Nio (#26)
#29Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Heikki Linnakangas (#20)
#30Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Jelte Fennema-Nio (#27)
#31David Rowley
dgrowleyml@gmail.com
In reply to: Melih Mutlu (#30)
#32Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#31)
#33Melih Mutlu
m.melihmutlu@gmail.com
In reply to: David Rowley (#31)
#34Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Robert Haas (#32)
#35Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Melih Mutlu (#33)
#36Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Melih Mutlu (#35)
#37Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Jelte Fennema-Nio (#36)
#38David Rowley
dgrowleyml@gmail.com
In reply to: Melih Mutlu (#37)
#39Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: David Rowley (#38)
#40David Rowley
dgrowleyml@gmail.com
In reply to: Jelte Fennema-Nio (#39)
#41Andres Freund
andres@anarazel.de
In reply to: David Rowley (#38)
#42Ranier Vilela
ranier.vf@gmail.com
In reply to: Andres Freund (#41)
#43Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Andres Freund (#41)
#44Andres Freund
andres@anarazel.de
In reply to: Jelte Fennema-Nio (#43)
#45David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#41)
#46Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Andres Freund (#44)
#47David Rowley
dgrowleyml@gmail.com
In reply to: Jelte Fennema-Nio (#46)
#48Ranier Vilela
ranier.vf@gmail.com
In reply to: Andres Freund (#44)
#49Melih Mutlu
m.melihmutlu@gmail.com
In reply to: David Rowley (#38)
#50Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: David Rowley (#47)
#51Ranier Vilela
ranier.vf@gmail.com
In reply to: Jelte Fennema-Nio (#50)
#52Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Jelte Fennema-Nio (#50)
#53Ranier Vilela
ranier.vf@gmail.com
In reply to: Jelte Fennema-Nio (#52)