Increase NUM_XLOGINSERT_LOCKS

Started by Yura Sokolov · 12 months ago · 8 messages
#1 Yura Sokolov
y.sokolov@postgrespro.ru
2 attachment(s)

Good day, hackers.

Zhiguo Zhow proposed to transform xlog reservation to a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]

While I believe lock-free reservation makes sense on huge servers, it is
hard to measure its effect on small servers and personal computers/notebooks.

But increasing NUM_XLOGINSERT_LOCKS gives a measurable performance gain
(in a synthetic test) even on my working notebook:

Ryzen 5825U (8 cores, 16 threads) limited to 2 GHz, Ubuntu 24.04

Test scenario:

- changes to config:

max_connections = 1000
shared_buffers = 1024MB
fsync = off
synchronous_commit = off
wal_sync_method = fdatasync
full_page_writes = off
wal_buffers = 1024MB
checkpoint_timeout = 1d

- table to test:

create table test1(id int8 not null);
create index test1_ix_id on test1(id);

- testing script, which inserts and deletes a lot of tuples:

\set id random(1, 1000000000000000)

begin;
insert into test1
select i
from generate_series(:id::int8, :id::int8 + 1000) as i;
delete from test1 where id >= :id::int8 and id <= :id::int8 + 1000;
end;

- way to run benchmark:

for i in 1 2 3 ; do
install/bin/pgbench -n -T 20 -j 100 -c 200 \
-M prepared -f test1.sql postgres
done;
install/bin/psql postgres -c 'truncate test1; checkpoint;'

I've tested:
- with 200 clients (-j 100 -c 200) and 400 clients (-j 200 -c 400)
- increasing NUM_XLOGINSERT_LOCKS to 64/128/256
- changing WALInsertLockAcquire to make 1 or 2 attempts at conditional
locking before falling back to a blocking acquire
(v0-0002-several-attempts-to-lock...).

Results are (min/avg/max tps):

18 at commit e28033fe1af8
  200 clients: 420/421/427
  400 clients: 428/443/444
locks 64
  200 clients: 576/591/599
  400 clients: 552/575/578
locks 64 + attempt=1
  200 clients: 648/680/687
  400 clients: 616/640/667
locks 64 + attempt=2
  200 clients: 676/696/712
  400 clients: 625/654/667
locks 128
  200 clients: 651/665/685
  400 clients: 620/651/666
locks 128 + attempt=1
  200 clients: 676/678/689
  400 clients: 628/652/676
locks 128 + attempt=2
  200 clients: 636/675/695
  400 clients: 618/658/672
locks 256
  200 clients: 672/678/716
  400 clients: 625/658/674
locks 256 + attempt=1
  200 clients: 673/687/702
  400 clients: 624/657/669
locks 256 + attempt=2
  200 clients: 664/695/697
  400 clients: 622/648/672

(Reminder: each transaction inserts and then deletes 1000 tuples in a
table with one index.)

Conclusions:
- without attempts at conditional locking, it is worth increasing
NUM_XLOGINSERT_LOCKS up to a huge 256 entries.
- with 2 attempts at conditional locking, it is enough (on my notebook)
to increase it to just 64 entries.
- with a huge number of locks (256), attempts at conditional locking
slightly degrade performance. At 128 there is no clear result, imho.

I propose increasing NUM_XLOGINSERT_LOCKS to 64 locks + 2 attempts at
conditional locking. I think it is the more conservative choice.
Alternatively, it should be increased to at least 128 locks.

To validate the proposed change I ran pgbench with:

install/bin/pgbench -i -s 50 postgres
for i in 1 2 3 ; do
install/bin/pgbench -n -T 20 -j 100 -c 100 -M prepared postgres
done

Results:

18 at commit e28033fe1af8
  100 clients: 18648/18708/18750
  400 clients: 13232/13329/13410
locks 64 + 2 attempts:
  100 clients: 19939/20048/20168
  400 clients: 13394/13394/13888

As you can see, with 100 clients the proposed change gives a ~6.5% gain in TPS.

(Note: the configuration was the same, i.e. fsync=off,
synchronous_commit=off, etc.)

Once the NUM_XLOGINSERT_LOCKS increase is settled in the master branch, I
believe lock-free reservation should be looked at more closely.

[1]: /messages/by-id/PH7PR11MB5796659F654F9BE983F3AD97EF142@PH7PR11MB5796.namprd11.prod.outlook.com

-----

regards
Yura Sokolov aka funny-falcon

Attachments:

v0-0001-Increase-NUM_XLOGINSERT_LOCKS-to-64.patch (text/x-patch)
From 93a4d4a7e2219a952c2a544047c19db9f0f0f5c0 Mon Sep 17 00:00:00 2001
From: Yura Sokolov <y.sokolov@postgrespro.ru>
Date: Thu, 16 Jan 2025 15:06:59 +0300
Subject: [PATCH v0 1/2] Increase NUM_XLOGINSERT_LOCKS to 64

---
 src/backend/access/transam/xlog.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bf3dbda901d..39381693db6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -147,7 +147,7 @@ int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
  * to happen concurrently, but adds some CPU overhead to flushing the WAL,
  * which needs to iterate all the locks.
  */
-#define NUM_XLOGINSERT_LOCKS  8
+#define NUM_XLOGINSERT_LOCKS  64
 
 /*
  * Max distance from last checkpoint, before triggering a new xlog-based
@@ -1448,7 +1448,11 @@ WALInsertLockRelease(void)
 	{
 		int			i;
 
-		for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
+		/*
+		 * LWLockRelease hopes we will release in reverse order for faster
+		 * search in held_lwlocks.
+		 */
+		for (i = NUM_XLOGINSERT_LOCKS - 1; i >= 0; i--)
 			LWLockReleaseClearVar(&WALInsertLocks[i].l.lock,
 								  &WALInsertLocks[i].l.insertingAt,
 								  0);
-- 
2.43.0

v0-0002-several-attempts-to-lock-WALInsertLocks.patch (text/x-patch)
From 382e0d7bcc4a5c462a4d67264f4adf91f6e4f573 Mon Sep 17 00:00:00 2001
From: Yura Sokolov <y.sokolov@postgrespro.ru>
Date: Thu, 16 Jan 2025 15:30:57 +0300
Subject: [PATCH v0 2/2] several attempts to lock WALInsertLocks

---
 src/backend/access/transam/xlog.c | 47 ++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 39381693db6..c1a2e29fdb8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -68,6 +68,7 @@
 #include "catalog/pg_database.h"
 #include "common/controldata_utils.h"
 #include "common/file_utils.h"
+#include "common/pg_prng.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
@@ -1370,8 +1371,7 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
 static void
 WALInsertLockAcquire(void)
 {
-	bool		immed;
-
+	int attempts = 2;
 	/*
 	 * It doesn't matter which of the WAL insertion locks we acquire, so try
 	 * the one we used last time.  If the system isn't particularly busy, it's
@@ -1383,29 +1383,38 @@ WALInsertLockAcquire(void)
 	 * (semi-)randomly.  This allows the locks to be used evenly if you have a
 	 * lot of very short connections.
 	 */
-	static int	lockToTry = -1;
+	static uint32 lockToTry = 0;
+	static uint32 lockDelta = 0;
 
-	if (lockToTry == -1)
-		lockToTry = MyProcNumber % NUM_XLOGINSERT_LOCKS;
-	MyLockNo = lockToTry;
+	if (lockDelta == 0)
+	{
+		uint32 rng = pg_prng_uint32(&pg_global_prng_state);
+
+		lockToTry = rng % NUM_XLOGINSERT_LOCKS;
+		lockDelta = ((rng >> 16) % NUM_XLOGINSERT_LOCKS) | 1; /* must be odd */
+	}
 
 	/*
 	 * The insertingAt value is initially set to 0, as we don't know our
 	 * insert location yet.
 	 */
-	immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
-	if (!immed)
-	{
-		/*
-		 * If we couldn't get the lock immediately, try another lock next
-		 * time.  On a system with more insertion locks than concurrent
-		 * inserters, this causes all the inserters to eventually migrate to a
-		 * lock that no-one else is using.  On a system with more inserters
-		 * than locks, it still helps to distribute the inserters evenly
-		 * across the locks.
-		 */
-		lockToTry = (lockToTry + 1) % NUM_XLOGINSERT_LOCKS;
-	}
+	MyLockNo = lockToTry;
+retry:
+	if (LWLockConditionalAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE))
+		return;
+	/*
+	 * If we couldn't get the lock immediately, try another lock next
+	 * time.  On a system with more insertion locks than concurrent
+	 * inserters, this causes all the inserters to eventually migrate to a
+	 * lock that no-one else is using.  On a system with more inserters
+	 * than locks, it still helps to distribute the inserters evenly
+	 * across the locks.
+	 */
+	lockToTry = (lockToTry + lockDelta) % NUM_XLOGINSERT_LOCKS;
+	MyLockNo = lockToTry;
+	if (--attempts)
+		goto retry;
+	LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
 }
 
 /*
-- 
2.43.0
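
A note on the retry stride in the 0002 patch: NUM_XLOGINSERT_LOCKS is a
power of two, so forcing lockDelta to be odd makes it coprime with the lock
count, and the probe sequence visits every lock before repeating. A minimal
standalone sketch (illustrative only, not part of the patch) that checks
this property:

/*
 * Standalone illustration: with a power-of-two slot count, any odd stride
 * is coprime with the count, so repeatedly adding it modulo the count
 * visits every slot before the sequence repeats.  This mirrors the
 * lockToTry/lockDelta logic above; the constants are illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

#define NUM_SLOTS 64			/* stands in for NUM_XLOGINSERT_LOCKS */

int
main(void)
{
	for (unsigned delta = 1; delta < NUM_SLOTS; delta += 2)
	{
		bool		seen[NUM_SLOTS] = {false};
		unsigned	slot = 0;

		for (int i = 0; i < NUM_SLOTS; i++)
		{
			seen[slot] = true;
			slot = (slot + delta) % NUM_SLOTS;
		}
		for (int i = 0; i < NUM_SLOTS; i++)
		{
			if (!seen[i])
			{
				printf("delta %u misses slot %d\n", delta, i);
				return 1;
			}
		}
	}
	printf("every odd delta visits all %d slots\n", NUM_SLOTS);
	return 0;
}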

#2 Andres Freund
andres@anarazel.de
In reply to: Yura Sokolov (#1)
Re: Increase NUM_XLOGINSERT_LOCKS

Hi,

On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:

Good day, hackers.

Zhiguo Zhow proposed to transform xlog reservation to a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]

While I believe lock-free reservation makes sense on huge servers, it is
hard to measure its effect on small servers and personal
computers/notebooks.

But increasing NUM_XLOGINSERT_LOCKS gives a measurable performance gain
(in a synthetic test) even on my working notebook:

Ryzen 5825U (8 cores, 16 threads) limited to 2 GHz, Ubuntu 24.04

I've experimented with this in the past.

Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.

c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n -M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));";) -P1 -T15

On a 2x Xeon Gold 5215, with max_wal_size = 150GB, and with the workload
run a few times beforehand to ensure the WAL is already allocated.

With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps

The main reason is that the increase in insert locks puts a lot more pressure
on the spinlock. Secondarily it's also that we spend more time iterating
through the insert locks when waiting, and that that causes a lot of cacheline
pingpong.
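
For context, the spinlock being referred to is insertpos_lck in
XLogCtl->Insert, which ReserveXLogInsertLocation() takes for every record
regardless of which WAL insertion lock the backend holds. A simplified
sketch of that path (abridged from xlog.c, details omitted):

/*
 * Abridged from ReserveXLogInsertLocation() in xlog.c: every inserter,
 * no matter which of the NUM_XLOGINSERT_LOCKS it holds, serializes on
 * this single spinlock to reserve its byte range, so adding insertion
 * locks mostly adds contenders here.
 */
SpinLockAcquire(&Insert->insertpos_lck);
startbytepos = Insert->CurrBytePos;
endbytepos = startbytepos + size;
prevbytepos = Insert->PrevBytePos;
Insert->CurrBytePos = endbytepos;
Insert->PrevBytePos = startbytepos;
SpinLockRelease(&Insert->insertpos_lck);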

On much larger machines this gets considerably worse. IIRC I saw something
like an 8x regression on a large machine in the past, but I couldn't find the
actual numbers anymore, so I wouldn't want to bet on it.

Greetings,

Andres Freund

#3 Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Andres Freund (#2)
Re: Increase NUM_XLOGINSERT_LOCKS

Excuse me, Andres, I found I pressed the wrong button when I sent this
letter the first time, and it went only to you. So I'm sending a copy now.

Please reply to this message with a copy of your answer. Your answer is
really valuable and should be published on the list.

16.01.2025 18:36, Andres Freund wrote:

Hi,

On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:

Good day, hackers.

Zhiguo Zhow proposed to transform xlog reservation to a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]

While I believe lock-free reservation makes sense on huge servers, it is
hard to measure its effect on small servers and personal
computers/notebooks.

But increasing NUM_XLOGINSERT_LOCKS gives a measurable performance gain
(in a synthetic test) even on my working notebook:

Ryzen 5825U (8 cores, 16 threads) limited to 2 GHz, Ubuntu 24.04

I've experimented with this in the past.

Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.

c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n -M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));";) -P1 -T15

On a 2x Xeon Gold 5215, with max_wal_size = 150GB, and with the workload
run a few times beforehand to ensure the WAL is already allocated.

With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps

So, even in your test you have a +50% gain from increasing
NUM_XLOGINSERT_LOCKS.

(And that is why I'm keen on a smaller increase, like up to 64, not 128.)

The main reason is that the increase in insert locks puts a lot more
pressure on the spinlock.

That is addressed by Zhiguo Zhow and me in the other thread [1]. But
increasing NUM_XLOGINSERT_LOCKS gives benefits right now (at least on
smaller installations), and "lock-free reservation" should be measured
against it.

Secondarily it's also that we spend more time iterating through the
insert locks when waiting, and that that causes a lot of cacheline
pingpong.

Waiting is done with LWLockWaitForVar, and there is no wait if
`insertingAt` is in the future. It looks very efficient in the master
branch code.

On much larger machines this gets considerably worse. IIRC I saw
something like an 8x regression on a large machine in the past, but I
couldn't find the actual numbers anymore, so I wouldn't want to bet on it.

I believe it should be remeasured.

[1]: /messages/by-id/flat/PH7PR11MB5796659F654F9BE983F3AD97EF142@PH7PR11MB5796.namprd11.prod.outlook.com

------
regards
Yura

#4 Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Yura Sokolov (#3)
Re: Increase NUM_XLOGINSERT_LOCKS

Since it seems Andres missed my request to send a copy of his answer,
here it is:

On 2025-01-16 18:55:47 +0300, Yura Sokolov wrote:

16.01.2025 18:36, Andres Freund wrote:

Hi,

On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:

Good day, hackers.

Zhiguo Zhow proposed to transform xlog reservation to a lock-free
algorithm so that NUM_XLOGINSERT_LOCKS can be increased on very huge
(480 vCPU) servers. [1]

While I believe lock-free reservation makes sense on huge servers, it is
hard to measure its effect on small servers and personal
computers/notebooks.

But increasing NUM_XLOGINSERT_LOCKS gives a measurable performance gain
(in a synthetic test) even on my working notebook:

Ryzen 5825U (8 cores, 16 threads) limited to 2 GHz, Ubuntu 24.04

I've experimented with this in the past.

Unfortunately increasing it substantially can make the contention on the
spinlock *substantially* worse.

c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n -M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));";) -P1 -T15

On a 2x Xeon Gold 5215, with max_wal_size = 150GB, and with the workload
run a few times beforehand to ensure the WAL is already allocated.

With
NUM_XLOGINSERT_LOCKS = 8: 1459 tps
NUM_XLOGINSERT_LOCKS = 80: 2163 tps

So, even in your test you have a +50% gain from increasing
NUM_XLOGINSERT_LOCKS.

(And that is why I'm keen on a smaller increase, like up to 64, not 128.)

Oops, I swapped the results around when reformatting them, sorry! It's
the opposite way, i.e. increasing the locks hurts.

Here's that issue fixed, and a few more NUM_XLOGINSERT_LOCKS values. This is
a slightly different disk (the other seems to be going the way of the dodo),
so the results aren't expected to be exactly the same.

NUM_XLOGINSERT_LOCKS    TPS
  1                     2583
  2                     2524
  4                     2711
  8                     2788
 16                     1938
 32                     1834
 64                     1865
128                     1543

The main reason is that the increase in insert locks puts a lot more
pressure on the spinlock.

That is addressed by Zhiguo Zhow and me in the other thread [1]. But
increasing NUM_XLOGINSERT_LOCKS gives benefits right now (at least on
smaller installations), and "lock-free reservation" should be measured
against it.

I know that there's that thread, I just don't see how we can increase
NUM_XLOGINSERT_LOCKS due to the regressions it can cause.

Secondarily it's also that we spend more time iterating through the
insert locks when waiting, and that that causes a lot of cacheline
pingpong.

Waiting is done with LWLockWaitForVar, and there is no wait if
`insertingAt` is in the future. It looks very efficient in the master
branch code.

But LWLockWaitForVar is called from WaitXLogInsertionsToFinish, which just
iterates over all locks.
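
For readers following along, that loop looks roughly like this (heavily
abridged from WaitXLogInsertionsToFinish() in xlog.c; the reservedUpto
bookkeeping and error handling are omitted), which is why the flush-side
cost grows with NUM_XLOGINSERT_LOCKS even when most slots are idle:

/*
 * Abridged from WaitXLogInsertionsToFinish() in xlog.c: the flush side
 * scans every insertion lock slot, so a larger NUM_XLOGINSERT_LOCKS means
 * more LWLockWaitForVar() calls and more shared cachelines touched per
 * flush, even if most slots are currently unused.
 */
for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
{
	XLogRecPtr	insertingat = InvalidXLogRecPtr;

	do
	{
		/*
		 * Sleep until the slot's insertingAt advances past 'upto', or
		 * until the lock is released (no insertion in progress).
		 */
		if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
							 &WALInsertLocks[i].l.insertingAt,
							 insertingat, &insertingat))
		{
			/* the lock was free, so no insertion is in progress */
			insertingat = InvalidXLogRecPtr;
			break;
		}
	} while (insertingat < upto);

	if (insertingat != InvalidXLogRecPtr && insertingat < finishedUpto)
		finishedUpto = insertingat;
}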

Greetings,

Andres Freund

#5 Japin Li
japinli@hotmail.com
In reply to: Yura Sokolov (#4)
Re: Increase NUM_XLOGINSERT_LOCKS

Hi, Yura Sokolov

I tested the patch on a Hygon C86 7490 (64 cores) using BenchmarkSQL 5.0
with 500 warehouses and 256 terminals, run time 10 minutes:

| case               | min          | avg          | max          |
|--------------------+--------------+--------------+--------------|
| master (4108440)   |   891,225.77 |   904,868.75 |   913,708.17 |
| lock 64            | 1,007,716.95 | 1,012,013.22 | 1,018,674.00 |
| lock 64 attempt 1  | 1,016,716.07 | 1,017,735.55 | 1,019,328.36 |
| lock 64 attempt 2  | 1,015,328.31 | 1,018,147.74 | 1,021,513.14 |
| lock 128           | 1,010,147.38 | 1,014,128.11 | 1,018,672.01 |
| lock 128 attempt 1 | 1,018,154.79 | 1,023,348.35 | 1,031,365.42 |
| lock 128 attempt 2 | 1,013,245.56 | 1,018,984.78 | 1,023,696.00 |

I didn't test NUM_XLOGINSERT_LOCKS with 16 and 32; however, I tested it
with 256 and got the following error:

2025-01-23 02:23:23.828 CST [333524] PANIC: too many LWLocks taken

I hope this test will be helpful.
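
Regarding the PANIC with 256 locks: WALInsertLockAcquireExclusive() (used,
for example, around checkpoints) takes every WAL insertion lock at once,
and lwlock.c only tracks up to MAX_SIMUL_LWLOCKS held locks per backend
(200 in current sources), so any NUM_XLOGINSERT_LOCKS above that limit
trips the check. Abridged:

/* lwlock.c */
#define MAX_SIMUL_LWLOCKS	200

/* in LWLockAcquire(); reported as PANIC when raised in a critical section */
if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS)
	elog(ERROR, "too many LWLocks taken");

/*
 * xlog.c, WALInsertLockAcquireExclusive() (abridged): one backend ends up
 * holding all NUM_XLOGINSERT_LOCKS insertion locks at the same time, so
 * the value has to stay below MAX_SIMUL_LWLOCKS.
 */
for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
	LWLockAcquire(&WALInsertLocks[i].l.lock, LW_EXCLUSIVE);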

--
Regards,
Japin Li

#6 wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Japin Li (#5)
Re: Increase NUM_XLOGINSERT_LOCKS

Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS = 64 is great; I
think it doesn't need to grow much. What do you think?

Regards


#7 Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: wenhui qiu (#6)
Re: Increase NUM_XLOGINSERT_LOCKS

23.01.2025 08:41, wenhui qiu wrote:

Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS = 64 is great; I
think it doesn't need to grow much. What do you think?

I agree: while 128 shows a small benefit, it is not that big at the
moment. Given that other waiting issues may arise from increasing it, 64
seems to be the sweet spot.

Probably in the future it could be increased further, after other places
are optimized.


#8 Japin Li
japinli@hotmail.com
In reply to: Yura Sokolov (#7)
Re: Increase NUM_XLOGINSERT_LOCKS

On Thu, 23 Jan 2025 at 15:50, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

23.01.2025 08:41, wenhui qiu wrote:

Hi Japin,
Thank you for your test. It seems NUM_XLOGINSERT_LOCKS = 64 is great; I
think it doesn't need to grow much. What do you think?

I agree: while 128 shows a small benefit, it is not that big at the
moment. Given that other waiting issues may arise from increasing it, 64
seems to be the sweet spot.

Probably in the future it could be increased further, after other places
are optimized.

+1.
-- 
Regards,
Japin Li