Increase NUM_XLOGINSERT_LOCKS
Good day, hackers.
Zhiguo Zhow proposed transforming xlog reservation into a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]
While I believe lock-free reservation makes sense on huge servers, it is
hard to measure on small servers and personal computers/notebooks.
However, increasing NUM_XLOGINSERT_LOCKS gives a measurable performance
gain (in a synthetic test) even on my work notebook:
Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
Test scenario:
- changes to config
max_connections = 1000
shared_buffers = 1024MB
fsync = off
synchronous_commit = off
wal_sync_method = fdatasync
full_page_writes = off
wal_buffers = 1024MB
checkpoint_timeout = 1d
- table to test:
create table test1(id int8 not null);
create index test1_ix_id on test1(id);
- testing script, which inserts and deletes a lot of tuples:
\set id random(1, 1000000000000000)
begin;
insert into test1
select i
from generate_series(:id::int8, :id::int8 + 1000) as i;
delete from test1 where id >= :id::int8 and id <= :id::int8 + 1000;
end;
- way to run benchmark:
for i in 1 2 3 ; do
install/bin/pgbench -n -T 20 -j 100 -c 200 \
-M prepared -f test1.sql postgres
done;
install/bin/psql postgres -c 'truncate test1; checkpoint;'
I've tested:
- with 200 clients (-j 100 -c 200) and 400 clients (-j 200 -c 400)
- increasing NUM_XLOGINSERT_LOCKS to 64/128/256
- a change in WALInsertLockAcquire that makes 1 or 2 conditional locking
attempts before blocking (v0-0002-several-attempts-to-lock...).
Results are (min/avg/max tps):
master (v18 devel) at commit e28033fe1af8
200 clients: 420/421/427
400 clients: 428/443/444
locks 64
200 clients: 576/591/599
400 clients: 552/575/578
locks 64 + attempt=1
200 clients: 648/680/687
400 clients: 616/640/667
locks 64 + attempt=2
200 clients: 676/696/712
400 clients: 625/654/667
locks 128
200 clients: 651/665/685
400 clients: 620/651/666
locks 128 + attempt=1
200 clients: 676/678/689
400 clients: 628/652/676
locks 128 + attempt=2
200 clients: 636/675/695
400 clients: 618/658/672
locks 256
200 clients: 672/678/716
400 clients: 625/658/674
locks 256 + attempt=1
200 clients: 673/687/702
400 clients: 624/657/669
locks 256 + attempt=2
200 clients: 664/695/697
400 clients: 622/648/672
(Reminder: each transaction inserts and deletes 1000 tuples in a table
with one index.)
Conclusions:
- Without conditional-lock attempts it is worth increasing
NUM_XLOGINSERT_LOCKS all the way up to 256 entries.
- With 2 conditional-lock attempts it is enough (on my notebook) to
increase it to just 64 entries.
- With a huge number of locks (256), conditional-lock attempts slightly
degrade performance. At 128 there is no clear result, imho.
I propose increasing NUM_XLOGINSERT_LOCKS to 64 locks plus 2 conditional
lock attempts (see the sketch below); I think that is the more
conservative choice.
Alternatively, it should be increased to at least 128 locks.
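In code, the proposed acquire path (condensed from the attached v0-0002
patch; comments and surrounding context trimmed, so treat it as a sketch
rather than the exact diff) looks like this:

static void
WALInsertLockAcquire(void)
{
	int			attempts = 2;	/* conditional probes before blocking */
	static uint32 lockToTry = 0;
	static uint32 lockDelta = 0;

	if (lockDelta == 0)
	{
		uint32		rng = pg_prng_uint32(&pg_global_prng_state);

		lockToTry = rng % NUM_XLOGINSERT_LOCKS;
		/* An odd delta is coprime with the power-of-two lock count, so
		 * successive probes eventually visit every lock. */
		lockDelta = ((rng >> 16) % NUM_XLOGINSERT_LOCKS) | 1;
	}

	MyLockNo = lockToTry;
	while (!LWLockConditionalAcquire(&WALInsertLocks[MyLockNo].l.lock,
									 LW_EXCLUSIVE))
	{
		/* Busy: hop to another lock and, if probes remain, try again. */
		lockToTry = (lockToTry + lockDelta) % NUM_XLOGINSERT_LOCKS;
		MyLockNo = lockToTry;
		if (--attempts == 0)
		{
			/* Out of conditional probes: block on the lock we landed on. */
			LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
			break;
		}
	}
}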
To validate the proposed change I ran pgbench with:
install/bin/pgbench -i -s 50 postgres
for i in 1 2 3 ; do
install/bin/pgbench -n -T 20 -j 100 -c 100 -M prepared postgres
done
Results:
master (v18 devel) at commit e28033fe1af8
100 clients: 18648/18708/18750
400 clients: 13232/13329/13410
locks 64 + 2 attempts (second chance):
100 clients: 19939/20048/20168
400 clients: 13394/13394/13888
As you can see, with 100 clients the proposed change gives a ~6.5% gain
in TPS.
(Note: the configuration was the same, i.e. fsync=off,
synchronous_commit=off, etc.)
Once the NUM_XLOGINSERT_LOCKS increase has settled in the master branch,
I believe lock-free reservation should be looked at more closely.
[1]: /messages/by-id/PH7PR11MB5796659F654F9BE983F3AD97EF142@PH7PR11MB5796.namprd11.prod.outlook.com
-----
regards
Yura Sokolov aka funny-falcon
Attachments:
v0-0001-Increase-NUM_XLOGINSERT_LOCKS-to-64.patch (text/x-patch)
From 93a4d4a7e2219a952c2a544047c19db9f0f0f5c0 Mon Sep 17 00:00:00 2001
From: Yura Sokolov <y.sokolov@postgrespro.ru>
Date: Thu, 16 Jan 2025 15:06:59 +0300
Subject: [PATCH v0 1/2] Increase NUM_XLOGINSERT_LOCKS to 64
---
src/backend/access/transam/xlog.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bf3dbda901d..39381693db6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -147,7 +147,7 @@ int wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
* to happen concurrently, but adds some CPU overhead to flushing the WAL,
* which needs to iterate all the locks.
*/
-#define NUM_XLOGINSERT_LOCKS 8
+#define NUM_XLOGINSERT_LOCKS 64
/*
* Max distance from last checkpoint, before triggering a new xlog-based
@@ -1448,7 +1448,11 @@ WALInsertLockRelease(void)
{
int i;
- for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
+ /*
+ * LWLockRelease hopes we will release in reverse order for faster
+ * search in held_lwlocks.
+ */
+ for (i = NUM_XLOGINSERT_LOCKS - 1; i >= 0; i--)
LWLockReleaseClearVar(&WALInsertLocks[i].l.lock,
&WALInsertLocks[i].l.insertingAt,
0);
--
2.43.0
v0-0002-several-attempts-to-lock-WALInsertLocks.patch (text/x-patch)
From 382e0d7bcc4a5c462a4d67264f4adf91f6e4f573 Mon Sep 17 00:00:00 2001
From: Yura Sokolov <y.sokolov@postgrespro.ru>
Date: Thu, 16 Jan 2025 15:30:57 +0300
Subject: [PATCH v0 2/2] several attempts to lock WALInsertLocks
---
src/backend/access/transam/xlog.c | 47 ++++++++++++++++++-------------
1 file changed, 28 insertions(+), 19 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39381693db6..c1a2e29fdb8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -68,6 +68,7 @@
#include "catalog/pg_database.h"
#include "common/controldata_utils.h"
#include "common/file_utils.h"
+#include "common/pg_prng.h"
#include "executor/instrument.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -1370,8 +1371,7 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
static void
WALInsertLockAcquire(void)
{
- bool immed;
-
+ int attempts = 2;
/*
* It doesn't matter which of the WAL insertion locks we acquire, so try
* the one we used last time. If the system isn't particularly busy, it's
@@ -1383,29 +1383,38 @@ WALInsertLockAcquire(void)
* (semi-)randomly. This allows the locks to be used evenly if you have a
* lot of very short connections.
*/
- static int lockToTry = -1;
+ static uint32 lockToTry = 0;
+ static uint32 lockDelta = 0;
- if (lockToTry == -1)
- lockToTry = MyProcNumber % NUM_XLOGINSERT_LOCKS;
- MyLockNo = lockToTry;
+ if (lockDelta == 0)
+ {
+ uint32 rng = pg_prng_uint32(&pg_global_prng_state);
+
+ lockToTry = rng % NUM_XLOGINSERT_LOCKS;
+ lockDelta = ((rng >> 16) % NUM_XLOGINSERT_LOCKS) | 1; /* must be odd */
+ }
/*
* The insertingAt value is initially set to 0, as we don't know our
* insert location yet.
*/
- immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
- if (!immed)
- {
- /*
- * If we couldn't get the lock immediately, try another lock next
- * time. On a system with more insertion locks than concurrent
- * inserters, this causes all the inserters to eventually migrate to a
- * lock that no-one else is using. On a system with more inserters
- * than locks, it still helps to distribute the inserters evenly
- * across the locks.
- */
- lockToTry = (lockToTry + 1) % NUM_XLOGINSERT_LOCKS;
- }
+ MyLockNo = lockToTry;
+retry:
+ if (LWLockConditionalAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE))
+ return;
+ /*
+ * If we couldn't get the lock immediately, try another lock next
+ * time. On a system with more insertion locks than concurrent
+ * inserters, this causes all the inserters to eventually migrate to a
+ * lock that no-one else is using. On a system with more inserters
+ * than locks, it still helps to distribute the inserters evenly
+ * across the locks.
+ */
+ lockToTry = (lockToTry + lockDelta) % NUM_XLOGINSERT_LOCKS;
+ MyLockNo = lockToTry;
+ if (--attempts)
+ goto retry;
+ LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
}
/*
--
2.43.0
Hi,
On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:
Good day, hackers.
Zhiguo Zhow proposed transforming xlog reservation into a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]
While I believe lock-free reservation makes sense on huge servers, it is
hard to measure on small servers and personal computers/notebooks.
However, increasing NUM_XLOGINSERT_LOCKS gives a measurable performance
gain (in a synthetic test) even on my work notebook:
Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
I've experimented with this in the past.
Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.
c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n -M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));";) -P1 -T15
On a 2x Xeon Gold 5215, with max_wal_size = 150GB and the workload ran a few
times to ensure WAL is already allocated.
With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps
The main reason is that the increase in insert locks puts a lot more pressure
on the spinlock. Secondarily it's also that we spend more time iterating
through the insert locks when waiting, and that that causes a lot of cacheline
pingpong.
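(For reference, the spinlock in question is the WAL reservation spinlock
insertpos_lck: every inserter, no matter which of the NUM_XLOGINSERT_LOCKS
insertion locks it holds, reserves its byte range in a short critical
section roughly like the following, paraphrased from
ReserveXLogInsertLocation() in xlog.c:)

	SpinLockAcquire(&Insert->insertpos_lck);

	startbytepos = Insert->CurrBytePos;
	endbytepos = startbytepos + size;
	prevbytepos = Insert->PrevBytePos;
	Insert->CurrBytePos = endbytepos;
	Insert->PrevBytePos = startbytepos;

	SpinLockRelease(&Insert->insertpos_lck);

More insertion locks allow more backends to reach this section
concurrently, which is where the extra spinlock pressure comes from.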
On much larger machines this gets considerably worse. IIRC I saw something
like an 8x regression on a large machine in the past, but I couldn't find the
actual numbers anymore, so I wouldn't want to bet on it.
Greetings,
Andres Freund
Excuse me, Andres, I found I pressed the wrong button when I sent this
letter the first time, so it went only to you. I'm sending the copy now.
Please reply to this message with a copy of your answer. Your answer is
really valuable and should be published on the list.
16.01.2025 18:36, Andres Freund wrote:
Hi,
On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:
Good day, hackers.
Zhiguo Zhow proposed transforming xlog reservation into a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]
While I believe lock-free reservation makes sense on huge servers, it is
hard to measure on small servers and personal computers/notebooks.
However, increasing NUM_XLOGINSERT_LOCKS gives a measurable performance
gain (in a synthetic test) even on my work notebook:
Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
I've experimented with this in the past.
Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.
c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n
-M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true,
'test', repeat('0', 1024*1024));";) -P1 -T15
On a 2x Xeon Gold 5215, with max_wal_size = 150GB and the workload ran a
few times to ensure WAL is already allocated.
With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps
So even in your test you get a +50% gain from increasing
NUM_XLOGINSERT_LOCKS.
(And that is why I'm keen on a smaller increase, like up to 64, not 128.)
The main reason is that the increase in insert locks puts a lot more
pressure on the spinlock.
That is addressed by Zhiguo Zhow and me in the other thread [1]. But
increasing NUM_XLOGINSERT_LOCKS gives benefits right now (at least on
smaller installations), and "lock-free reservation" should be measured
against it.
Secondarily it's also that we spend more time iterating through the
insert locks when waiting, and that that causes a lot of cacheline
pingpong.
Waiting is done with LWLockWaitForVar, and there is no wait if
`insertingAt` is in the future. It looks very efficient in the master
branch code.
On much larger machines this gets considerably worse. IIRC I saw
something like an 8x regression on a large machine in the past, but I
couldn't find the actual numbers anymore, so I wouldn't want to bet on
it.
I believe it should be remeasured.
[1]: /messages/by-id/flat/PH7PR11MB5796659F654F9BE983F3AD97EF142@PH7PR11MB5796.namprd11.prod.outlook.com
------
regards
Yura
Since it seems Andres missed my request to send a copy of his answer,
here it is:
On 2025-01-16 18:55:47 +0300, Yura Sokolov wrote:
16.01.2025 18:36, Andres Freund wrote:
Hi,
On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:
Good day, hackers.
Zhiguo Zhow proposed transforming xlog reservation into a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]
While I believe lock-free reservation makes sense on huge servers, it is
hard to measure on small servers and personal computers/notebooks.
However, increasing NUM_XLOGINSERT_LOCKS gives a measurable performance
gain (in a synthetic test) even on my work notebook:
Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
I've experimented with this in the past.
Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.
c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n
-M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true,
'test', repeat('0', 1024*1024));";) -P1 -T15
On a 2x Xeon Gold 5215, with max_wal_size = 150GB and the workload ran a
few times to ensure WAL is already allocated.
With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps
So even in your test you get a +50% gain from increasing
NUM_XLOGINSERT_LOCKS.
(And that is why I'm keen on a smaller increase, like up to 64, not 128.)
Oops, I swapped the results around when reformatting them, sorry! It's
the opposite way, i.e. increasing the locks hurts.
Here's that issue fixed, plus a few more NUM_XLOGINSERT_LOCKS values.
This is a slightly different disk (the other seems to have gone the way
of the dodo), so the results aren't expected to be exactly the same.
NUM_XLOGINSERT_LOCKS TPS
1 2583
2 2524
4 2711
8 2788
16 1938
32 1834
64 1865
128 1543
The main reason is that the increase in insert locks puts a lot more
pressure on the spinlock.
That is addressed by Zhiguo Zhow and me in the other thread [1]. But
increasing NUM_XLOGINSERT_LOCKS gives benefits right now (at least on
smaller installations), and "lock-free reservation" should be measured
against it.
I know that there's that thread, I just don't see how we can increase
NUM_XLOGINSERT_LOCKS due to the regressions it can cause.
Secondarily it's also that we spend more time iterating through the
insert locks when waiting, and that that causes a lot of cacheline
pingpong.
Waiting is done with LWLockWaitForVar, and there is no wait if
`insertingAt` is in the future. It looks very efficient in the master
branch code.
But LWLockWaitForVar is called from WaitXLogInsertionsToFinish, which just
iterates over all locks.
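(A simplified sketch of that loop, paraphrased from
WaitXLogInsertionsToFinish() in xlog.c; details vary between versions:)

	finishedUpto = reservedUpto;
	for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
	{
		XLogRecPtr	insertingat = InvalidXLogRecPtr;

		do
		{
			/*
			 * LWLockWaitForVar() sleeps only while the lock is held and
			 * insertingAt still equals the value we last saw; it returns
			 * as soon as the lock is freed or the holder advertises a new
			 * position.
			 */
			if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
								 &WALInsertLocks[i].l.insertingAt,
								 insertingat, &insertingat))
			{
				/* lock was free, so no insertion in progress here */
				insertingat = InvalidXLogRecPtr;
				break;
			}
			/* still held: loop until the holder gets past 'upto' */
		} while (insertingat < upto);

		if (insertingat != InvalidXLogRecPtr && insertingat < finishedUpto)
			finishedUpto = insertingat;
	}

So every flush touches all NUM_XLOGINSERT_LOCKS cache lines, whether or
not the corresponding locks are held.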
Greetings,
Andres Freund
Hi, Yura Sokolov
I tested the patch on a Hygon C86 7490 (64 cores) using BenchmarkSQL 5.0
with 500 warehouses and 256 terminals, run time 10 minutes:
| case | min | avg | max |
|--------------------+--------------+--------------+--------------|
| master (4108440) | 891,225.77 | 904,868.75 | 913,708.17 |
| lock 64 | 1,007,716.95 | 1,012,013.22 | 1,018,674.00 |
| lock 64 attempt 1 | 1,016,716.07 | 1,017,735.55 | 1,019,328.36 |
| lock 64 attempt 2 | 1,015,328.31 | 1,018,147.74 | 1,021,513.14 |
| lock 128 | 1,010,147.38 | 1,014,128.11 | 1,018,672.01 |
| lock 128 attempt 1 | 1,018,154.79 | 1,023,348.35 | 1,031,365.42 |
| lock 128 attempt 2 | 1,013,245.56 | 1,018,984.78 | 1,023,696.00 |
I didn't test NUM_XLOGINSERT_LOCKS at 16 and 32; however, I tested it
with 256 and got the following error:
2025-01-23 02:23:23.828 CST [333524] PANIC: too many LWLocks taken
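(Most likely that PANIC comes from lwlock.c's cap on simultaneously held
LWLocks, which WALInsertLockAcquireExclusive() exceeds once it has to
grab all 256 insertion locks at once, e.g. during a checkpoint:)

	/* lwlock.c */
	#define MAX_SIMUL_LWLOCKS	200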
I hope this test will be helpful.
--
Regards,
Japin Li
Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS 64 is great; I
think it doesn't need to grow much. What do you think?
Regards
23.01.2025 08:41, wenhui qiu wrote:
Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS 64 is great; I
think it doesn't need to grow much. What do you think?
I agree: while 128 shows a small benefit, it is not that big at the
moment. Given the other waiting issues that may arise from increasing it,
64 seems to be the sweet spot.
Probably in the future it could be increased further, after other places
are optimized.
On Thu, 23 Jan 2025 at 15:50, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
23.01.2025 08:41, wenhui qiu wrote:
Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS 64 is great; I
think it doesn't need to grow much. What do you think?
I agree: while 128 shows a small benefit, it is not that big at the
moment. Given the other waiting issues that may arise from increasing it,
64 seems to be the sweet spot.
Probably in the future it could be increased further, after other places
are optimized.
+1.
--
Regards,
Japin Li