BUG #17307: Performance deviation between the multiple iterations (NOPM & TPM values).

Started by PG Bug reporting form, over 4 years ago, 3 messages, bugs

noreply@postgresql.org

over 4 years ago

The following bug has been logged on the website:

Bug reference: 17307
Logged by: HPC Researcher
Email address: researcherhpc@gmail.com
PostgreSQL version: 14.0
Operating system: RHEL 8.4
Description:

NOPM values captured with HammerDB-v4.3 scripts (schema_tpcc.tcl and
test_tpcc.tcl) for multiple trials.
The expected performance deviation between multiple trials should be less
than 2%.

Hardware configuration
Architecture x86_64
CPU op-mode(s) 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 256
On-line CPU(s) list: 0-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 8
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
OS: RHEL8.4
RAM SIZE:512
SSD:1TB

Postgresql.conf

autovacuum_max_workers = 16
autovacuum_vacuum_cost_limit = 3000
checkpoint_completion_target = 0.9
checkpoint_timeout = '15min'
cpu_tuple_cost = 0.03
effective_cache_size = '350GB'
listen_addresses = '*'
maintenance_work_mem = '2GB'
max_connections = 1000
max_wal_size = '128GB'
random_page_cost = 1.1
shared_buffers = '128GB'
wal_buffers = '1GB'
work_mem = '128MB'
random_page_cost = 1.1
effective_io_concurrency = 200
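
The settings above list random_page_cost twice with the same value. When a parameter is repeated in postgresql.conf, PostgreSQL uses the last entry, so the repeat is harmless here, but repeated entries with different values can quietly change what a benchmark run actually uses. A minimal sketch of a duplicate check follows. The parser is a simplification that assumes plain `name = value` lines and would mishandle a `#` inside a quoted value. The names `CONF_TEXT`, `parse_settings`, and `duplicates` are illustrative and not part of any PostgreSQL tool.

```python
from collections import defaultdict

# Settings copied from the report above (assumption: this is the whole file).
CONF_TEXT = """\
autovacuum_max_workers = 16
autovacuum_vacuum_cost_limit = 3000
checkpoint_completion_target = 0.9
checkpoint_timeout = '15min'
cpu_tuple_cost = 0.03
effective_cache_size = '350GB'
listen_addresses = '*'
maintenance_work_mem = '2GB'
max_connections = 1000
max_wal_size = '128GB'
random_page_cost = 1.1
shared_buffers = '128GB'
wal_buffers = '1GB'
work_mem = '128MB'
random_page_cost = 1.1
effective_io_concurrency = 200
"""


def parse_settings(text):
    """Return {name: [values in file order]} for simple 'name = value' lines."""
    settings = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue  # skip blank or unrecognized lines
        name, value = line.split("=", 1)
        settings[name.strip().lower()].append(value.strip().strip("'"))
    return dict(settings)


def duplicates(settings):
    """Keep only the parameters that appear more than once."""
    return {name: vals for name, vals in settings.items() if len(vals) > 1}


if __name__ == "__main__":
    for name, vals in duplicates(parse_settings(CONF_TEXT)).items():
        print(f"{name} is set {len(vals)} times: {vals} (last entry wins)")
```

On a running server, `SHOW random_page_cost;` confirms the value actually in effect.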

HammerDB Scripts

cat schema.tcl

#!/bin/tclsh
dbset db pg
diset connection pg_host localhost
diset connection pg_port 5432
diset tpcc pg_count_ware 400
diset tpcc pg_num_vu 50
print dict
buildschema
waittocomplete

Test runs (starting with 1 VU, then 2, 4, etc.):

Virtual Users   Trial-1 (NOPM)   Trial-2 (NOPM)   %diff
12              99390            92913            6.516752
140             561429           525408           6.415949
192             636016           499574           21.4526
230             621644           701882           12.9074
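
The %diff figures above are consistent with the absolute difference between the two trials divided by the first trial's NOPM. That formula is inferred from the numbers and is not stated in the report. A minimal sketch that reproduces the column and checks it against the 2% target follows; the names `results`, `percent_diff`, and `deviations` are illustrative.

```python
# Trial results copied from the table above:
# virtual users -> (trial 1 NOPM, trial 2 NOPM)
results = {
    12: (99390, 92913),
    140: (561429, 525408),
    192: (636016, 499574),
    230: (621644, 701882),
}


def percent_diff(trial1, trial2):
    """Absolute difference between two trials as a percentage of trial 1."""
    return abs(trial1 - trial2) / trial1 * 100.0


def deviations(data):
    """Map each virtual-user count to its trial-to-trial percent difference."""
    return {vu: percent_diff(t1, t2) for vu, (t1, t2) in data.items()}


if __name__ == "__main__":
    for vu, diff in deviations(results).items():
        verdict = "exceeds" if diff > 2.0 else "within"
        print(f"{vu:>4} VUs: {diff:8.4f}% ({verdict} the 2% target)")
```

Every row exceeds the 2% target. The direction of the change is not consistent: trial 2 is lower at 12, 140, and 192 virtual users but higher at 230.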

Tom Lane

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: PG Bug reporting form (#1)

Re: BUG #17307: Performance deviation between the multiple iterations (NOPM & TPM values).

PG Bug reporting form <noreply@postgresql.org> writes:

NOPM values captured with HammerDB-v4.3 scripts (schema_tpcc.tcl and
test_tpcc.tcl) for multiple trials.
The expected performance deviation between multiple trials should be less
than 2%.

According to who? Even if you'd provided an easily reproducible
example, I doubt we'd accept this as a bug. Adding more sessions
does not have zero cost.

regards, tom lane

HPC Researcher

researcherhpc@gmail.com

over 4 years ago

In reply to: Tom Lane (#2)

Re: BUG #17307: Performance deviation between the multiple iterations (NOPM & TPM values).

As per the HammerDB documentation, the same test run for multiple iterations
on the same hardware should show little deviation (1%-2%).

We noticed that the TPC-C performance (NOPM/TPM) deviation ranges from more
than 2% up to 21% with virtual users from 1 to 250 on a 2-socket system when
running multiple iterations (5-6 runs).
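
For more than two iterations, a pairwise %diff no longer describes the spread, so a summary statistic over all runs is easier to compare against a 1%-2% target. A minimal sketch follows, assuming the coefficient of variation (sample standard deviation divided by the mean) is an acceptable measure. The thread does not say how the deviation should be computed. The function name `run_to_run_variation` and the 2% default are illustrative, and the sample input reuses the two 192-VU trials from the original report because the per-run values for the 5-6 runs are not given.

```python
import statistics


def run_to_run_variation(nopm_values, target_pct=2.0):
    """Summarize the spread of NOPM results from repeated identical runs."""
    if len(nopm_values) < 2:
        raise ValueError("need at least two runs to measure variation")
    mean = statistics.mean(nopm_values)
    stdev = statistics.stdev(nopm_values)  # sample standard deviation
    cv_pct = stdev / mean * 100.0
    spread_pct = (max(nopm_values) - min(nopm_values)) / mean * 100.0
    return {
        "runs": len(nopm_values),
        "mean": mean,
        "cv_pct": cv_pct,          # coefficient of variation
        "spread_pct": spread_pct,  # (max - min) relative to the mean
        "within_target": cv_pct <= target_pct,
    }


if __name__ == "__main__":
    # Two 192-VU trials from the original report; replace with all 5-6 runs.
    summary = run_to_run_variation([636016, 499574])
    for key, value in summary.items():
        print(f"{key}: {value}")
```

With only two runs the estimate is rough. Feeding in every iteration for a given virtual-user count gives a more meaningful figure.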

Checked with different configurations and system settings, as below:

1. Reduced the maximum connections (for example, max_connections from 1700
to 200 in postgresql.conf).

2. Reduced the warehouses in the schema build, i.e. pg_count_ware from 800
to 400/200.

3. Rebuilt the schema for each run/iteration (after capturing the results of
an iteration, deleted/dropped tpcc, restarted postgres, and rebuilt the
schema for the next iteration).

4. Unmounted and remounted the /data folder on the SSD for each iteration.

5. NUMA settings such as taskset/core pinning and SMT-OFF/SMT-ON.

6. Test runs with different NUMA policies, such as numactl --interleave=all
or NUMA auto balancing.

7. Default postgresql.conf with fewer virtual users (such as
1, 2, 4, 8, 12, 16, 20), a small warehouse count such as 20, and pg_num_vu 4.

8. Ran HammerDB on a client machine and PostgreSQL on the master machine.

Here are the questions:

1. What is the right way to test PostgreSQL with HammerDB for multiple
iterations?

2. Is the performance deviation across multiple runs expected because of raw
Postgres performance?

3. Can "CPU usage, I/O volume, I/O Latency & HDD/SSD latency" be the reason
for deviation?

Thanks

On Fri, 3 Dec 2021 at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

NOPM values captured with HammerDB-v4.3 scripts (schema_tpcc.tcl and
test_tpcc.tcl) for multiple trials.
The expected performance deviation between multiple trials should be less
than 2%.

According to who? Even if you'd provided an easily reproducible
example, I doubt we'd accept this as a bug. Adding more sessions
does not have zero cost.

regards, tom lane