Commit Latency
HI Team,
I have a system handling *300 TPS*, with resource usage *below 10%*.
However, I’m noticing *commit latency of around 200ms* for *1% of
transactions*, occasionally spiking to *1 second*. Since there is no
significant *I/O pressure*, I’m trying to identify what else might be
causing this in *PostgreSQL 16*.
Below are the parameters, Version and OS details.
[image: image.png]
Regards,
Ram.
Attachments:
image.pngimage/png; name=image.pngDownload+8-2
I think the first thing you should do is verify the performance of the
storage, particularly how long it takes to do fsync. You didn't explain
what kind of storage or filesystem you're using, so hard to say.
It's not clear to me what "resource usage" means exactly, but the fsync
may be slow even when nothing else competes for I/O.
regards
Tomas
On 2/5/25 11:18, Ramakrishna m wrote:
HI Team,
I have a system handling *300 TPS*, with resource usage *below 10%*.
However, I’m noticing *commit latency of around 200ms* for *1% of
transactions*, occasionally spiking to *1 second*. Since there is no
significant *I/O pressure*, I’m trying to identify what else might be
causing this in *PostgreSQL 16*.Below are the parameters, Version and OS details.
image.png
Regards,
Ram.
--
Tomas Vondra
On Wed, 2025-02-05 at 15:48 +0530, Ramakrishna m wrote:
I’m noticing commit latency of around 200ms for 1% of transactions, occasionally
spiking to 1 second. Since there is no significant I/O pressure, I’m trying to
identify what else might be causing this in PostgreSQL 16.
https://www.cybertec-postgresql.com/en/why-do-i-have-a-slow-commit-in-postgresql/
Yours,
Laurenz Albe
Hello
On 2025-Feb-05, Ramakrishna m wrote:
I have a system handling *300 TPS*, with resource usage *below 10%*.
However, I’m noticing *commit latency of around 200ms* for *1% of
transactions*, occasionally spiking to *1 second*. Since there is no
significant *I/O pressure*, I’m trying to identify what else might be
causing this in *PostgreSQL 16*.
max_connections=8000 doesn't sound great -- how many of those are
active, typically, and how many are idle-in-transaction? And you have
autovacuum_naptime=5s ... which sounds rather dubious. Either somebody
with great expertise configured this very carefully, or the settings are
somewhat randomly chosen with little or no expert oversight. Do you
have monitoring on the amount of bloat on these database? Maybe you
should consider connection pooling and limit the number that are active,
for starters.
Maybe have a look at whether pg_wait_sampling can give you more clues.
Some basic bloat monitoring is a prerequisite to any further performance
tweaking anyhow.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
Thanks for the suggestions!
It looks the issue is happening at the time of wal creation, does
wal_init_zero off is good option?
Best
On Wed, 5 Feb, 2025, 9:07 pm Álvaro Herrera, <alvherre@alvh.no-ip.org>
wrote:
Show quoted text
Hello
On 2025-Feb-05, Ramakrishna m wrote:
I have a system handling *300 TPS*, with resource usage *below 10%*.
However, I’m noticing *commit latency of around 200ms* for *1% of
transactions*, occasionally spiking to *1 second*. Since there is no
significant *I/O pressure*, I’m trying to identify what else might be
causing this in *PostgreSQL 16*.max_connections=8000 doesn't sound great -- how many of those are
active, typically, and how many are idle-in-transaction? And you have
autovacuum_naptime=5s ... which sounds rather dubious. Either somebody
with great expertise configured this very carefully, or the settings are
somewhat randomly chosen with little or no expert oversight. Do you
have monitoring on the amount of bloat on these database? Maybe you
should consider connection pooling and limit the number that are active,
for starters.Maybe have a look at whether pg_wait_sampling can give you more clues.
Some basic bloat monitoring is a prerequisite to any further performance
tweaking anyhow.--
Álvaro Herrera Breisgau, Deutschland —
https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
On 2/7/25 14:00, ravi k wrote:
Thanks for the suggestions!
It looks the issue is happening at the time of wal creation, does
wal_init_zero off is good option?
It's impossible to say if that option will help, considering how little
info about the system you provided. I suggest you start by answering the
questions you've been asked by me & Alvaro, explain why you concluded
it's related to WAL creation, etc.
regards
--
Tomas Vondra
Please find the detail below:
Our system is of 4TB(SSD shared pure storage) and *xfs *filesystem:
- Also handling 300 TPS (~3500 DML queries)
- 80% of modified data is BLOB (text)
- Table sizes generally under 70GB
- Unusual naptime configuration due to data characteristics
- Idle in TX expected regularly ~1ms
- Action items are in pipeline:
a. Adjusting vacuum parameters
b. Planning manual vacuum jobs during off-hours
- Note: App team is working on refactoring, but this will take time
Pg_stat_activity o/p:
state | count
---------------------+-------
| 5
active | 9
idle in transaction | 2
idle | 212
(4 rows)
Laurenz, thanks for the article.
Alvaro, regarding pg_wait_sampling, we are unable to enable it at this time
due to the requirement for a server restart.
Tomas, please find the fsync log attached below for your review.Let me know
if you need any further information or clarification.
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 3530.460 ops/sec 283 usecs/op
fdatasync 3477.850 ops/sec 288 usecs/op
fsync 2990.668 ops/sec 334 usecs/op
fsync_writethrough n/a
open_sync 3287.932 ops/sec 304 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 1752.086 ops/sec 571 usecs/op
fdatasync 3198.132 ops/sec 313 usecs/op
fsync 2980.334 ops/sec 336 usecs/op
fsync_writethrough n/a
open_sync 1555.356 ops/sec 643 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 2898.926 ops/sec 345 usecs/op
2 * 8kB open_sync writes 1631.841 ops/sec 613 usecs/op
4 * 4kB open_sync writes 861.491 ops/sec 1161 usecs/op
8 * 2kB open_sync writes 451.548 ops/sec 2215 usecs/op
16 * 1kB open_sync writes 233.062 ops/sec 4291 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 2747.990 ops/sec 364 usecs/op
write, close, fsync 2804.104 ops/sec 357 usecs/op
Non-sync'ed 8kB writes:
write 386932.826 ops/sec 3 usecs/op
*We suspect issues with wal_init_zero due to a 3-minute log commit spike
coinciding with increased WAL creation time, as shown in the attached
graphs.*
*Note: Size of the wal file is 1GB and generates wal of 3GB/min*
active | IO: *WALInitWrite *| 00:00:00.249311 | COMMIT
IO | *WALInitSync *| active 474.163 ms| COMMIT
Commit spike timings:
[image: image.png]
Wal creation_timings:
[image: image.png]
Best,
Ravi
On Fri, Feb 7, 2025 at 10:14 PM Tomas Vondra <tomas@vondra.me> wrote:
Show quoted text
On 2/7/25 14:00, ravi k wrote:
Thanks for the suggestions!
It looks the issue is happening at the time of wal creation, does
wal_init_zero off is good option?It's impossible to say if that option will help, considering how little
info about the system you provided. I suggest you start by answering the
questions you've been asked by me & Alvaro, explain why you concluded
it's related to WAL creation, etc.regards
--
Tomas Vondra