increase size of pg_commit_ts buffers
I wrote this patch last year in response to a customer issue and I
thought I had submitted it here, but evidently I didn't. So here it is.
The short story is: in commit 5364b357fb11 we increased the size of
pg_commit (née pg_clog) but we didn't increase the size of pg_commit_ts
to match. When commit_ts is in use, this can lead to significant buffer
thrashing and thus poor performance.
Since commit_ts entries are larger than pg_commit's, my proposed patch uses
twice as many buffers.
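For reference, here's a sketch of the two sizing functions side by side,
simplified from the tree as I remember it (the clog one as of
5364b357fb11, the commit_ts one as it stood before this patch):

    /* clog.c, after 5364b357fb11 raised the cap to 128 buffers */
    Size
    CLOGShmemBuffers(void)
    {
        return Min(128, Max(4, NBuffers / 512));
    }

    /* commit_ts.c, left untouched back then, still capped at 16 */
    Size
    CommitTsShmemBuffers(void)
    {
        return Min(16, Max(4, NBuffers / 1024));
    }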
Suffice it to say once we did this the customer problem went away.
(Andrey Borodin already has a patch to change the behavior for
multixact, which is something we should perhaps also do. I now notice
that they're also reporting a bug in that thread ... sigh)
--
Álvaro Herrera 39°49'30"S 73°17'W
"The problem with the future is that it keeps turning into the present"
(Hobbes)
On Fri, Jan 15, 2021 at 07:07:44PM -0300, Alvaro Herrera wrote:
I wrote this patch last year in response to a customer issue and I
thought I had submitted it here, but evidently I didn't. So here it is.
The short story is: in commit 5364b357fb11 we increased the size of
pg_commit (née pg_clog) but we didn't increase the size of pg_commit_ts
to match. When commit_ts is in use, this can lead to significant buffer
thrashing and thus poor performance.
Since commit_ts entries are larger than pg_commit's, my proposed patch uses
twice as many buffers.
This is a step in the right direction. With commit_ts entries being forty
times as large as pg_xact's, it's not self-evident that just twice as many
buffers is appropriate. Did you try other numbers? I'm fine with proceeding
even if not, but the comment should then admit that the new number was a guess
that solved problems for one site.
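(For anyone following along: the forty-times figure comes from the
per-transaction entry sizes; a sketch of the struct from commit_ts.c,
with the sizes annotated by me:)

    typedef struct CommitTimestampEntry
    {
        TimestampTz time;       /* 8 bytes */
        RepOriginId nodeid;     /* 2 bytes */
    } CommitTimestampEntry;     /* 10 bytes = 80 bits per xact */

    /* pg_xact stores just 2 status bits per xact: 80 / 2 = 40x */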
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -530,7 +530,7 @@ pg_xact_commit_timestamp_origin(PG_FUNCTION_ARGS)
The comment right above here is outdated.
 Size
 CommitTsShmemBuffers(void)
 {
-	return Min(16, Max(4, NBuffers / 1024));
+	return Min(256, Max(4, NBuffers / 512));
 }

 /*
On 16 Jan 2021, at 03:07, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Andrey Borodin already has a patch to change the behavior for
multixact, which is something we should perhaps also do. I now notice
that they're also reporting a bug in that thread ... sigh
I've tried in that thread [0] to do more intelligent optimisation than just increasing the number of buffers.
Though, in fact, bigger memory had a dramatically better effect than the lock tricks.
Maybe let's make all SLRU buffer sizes configurable?
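If we did, I'd imagine something along these lines in guc.c's
ConfigureNamesInt (the GUC name, variable, default and limits below are
all invented, just to show the shape, not a concrete proposal):

    /* hypothetical GUC entry; name, variable and limits illustrative only */
    {
        {"commit_ts_buffers", PGC_POSTMASTER, RESOURCES_MEM,
            gettext_noop("Sets the number of shared memory buffers "
                         "used for commit timestamp data."),
            NULL,
            GUC_UNIT_BLOCKS
        },
        &commit_ts_buffers,
        16, 4, 1024,
        NULL, NULL, NULL
    },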
Best regards, Andrey Borodin.
[0]: /messages/by-id/b4911e88-9969-aaba-f6be-ed57bd5fec36@darold.net
On Mon, Feb 15, 2021 at 11:56 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
On 16 Jan 2021, at 03:07, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Andrey Borodin already has a patch to change the behavior for
multixact, which is something we should perhaps also do. I now notice
that they're also reporting a bug in that thread ... sigh
I've tried in that thread [0] to do more intelligent optimisation than just increasing the number of buffers.
Though, in fact, bigger memory had a dramatically better effect than the lock tricks.
Maybe let's make all SLRU buffer sizes configurable?
+1
I got interested in the SLRU sizing problem (the lock trick and CV
stuff sounds interesting too, but I didn't have time to review that in
this cycle). The main problem I'm aware of with it is the linear
search, so I tried to fix that. See Andrey's thread for details.
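For those who haven't looked: the linear search lives in slru.c, where
every page access scans all buffer slots. Condensed from
SlruSelectLRUPage, from memory and with details elided:

    /* see if the page already has a buffer assigned: O(num_slots) */
    int slotno;

    for (slotno = 0; slotno < shared->num_slots; slotno++)
    {
        if (shared->page_number[slotno] == pageno &&
            shared->page_status[slotno] != SLRU_PAGE_EMPTY)
            return slotno;          /* page already buffered */
    }
    /* not found: scan all slots again for the least-recently-used
       victim, evict it, and read the page in from disk */

So making the SLRUs much bigger without fixing this trades buffer misses
for longer scans on every access.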
Hi,
I think this is ready to go. I was wondering why it merely doubles the
number of buffers, as described in earlier comments. That seems like a
very small increase; considering how much hardware has grown over the
last few years, it'd probably fail to help some of the large boxes.
But it turns out that's not what the patch does. The change is this
- return Min(16, Max(4, NBuffers / 1024));
+ return Min(256, Max(4, NBuffers / 512));
So it does two things: (a) it increases the maximum from 16 to 256 (so
16x), and (b) it doubles the rate at which we get there. Until now we
added 1 buffer per 1024 shared buffers, so the maximum was reached at
128MB of shared buffers. The patch lowers the step to 512 and raises
the point where the maximum is reached to 1GB.
So this actually increases the number of commit_ts buffers 16x, not 2x.
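To make the arithmetic concrete, here's a tiny standalone program
checking both formulas (Min/Max redefined locally to mimic the Postgres
macros; NBuffers values chosen to match the 128MB and 1GB cases):

    #include <stdio.h>

    #define Min(a,b) ((a) < (b) ? (a) : (b))
    #define Max(a,b) ((a) > (b) ? (a) : (b))

    int main(void)
    {
        /* NBuffers = shared_buffers / 8kB page size */
        int nbuffers[] = {16384 /* 128MB */, 131072 /* 1GB */};

        for (int i = 0; i < 2; i++)
        {
            int n = nbuffers[i];
            printf("NBuffers=%6d  old=%3d  new=%3d\n", n,
                   Min(16, Max(4, n / 1024)),
                   Min(256, Max(4, n / 512)));
        }
        return 0;
    }
    /* prints: NBuffers= 16384  old= 16  new= 32
               NBuffers=131072  old= 16  new=256 */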
That seems reasonable, I guess. The increase may be smaller for systems
with less than 1GB of shared buffers, but IMO that's a tiny minority of
production systems busy enough for this patch to make a difference.
The other question is of course what overhead this change could have on
workloads that don't have issues with commit_ts buffers (i.e. they're
using commit_ts, but would be fine with just the 16 buffers). But my
guess is that it's negligible, based on how simple the SLRU code is and
on my previous experiments with SLRU.
So +1 to just get this committed, as it is.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, 12 Nov 2021 at 17:39, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
So +1 to just get this committed, as it is.
This is a real issue affecting Postgres users. Please commit this soon.
--
Simon Riggs http://www.EnterpriseDB.com/
On 2021-Nov-30, Simon Riggs wrote:
On Fri, 12 Nov 2021 at 17:39, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
So +1 to just get this committed, as it is.
This is a real issue affecting Postgres users. Please commit this soon.
Uh ouch, I had forgotten that this patch was mine (blush). Thanks for
the ping, I pushed it yesterday. I added a comment.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/