Pgbench performance tuning?

Started by Greg Starkover 9 years ago5 messages

stark@mit.edu

over 9 years ago

I'm trying to run pgbench on a moderately beefy machine (4-core 3.4GHz
with 32G of ram and md mirrored spinning rust drives) at scale 300
with 32 clients with duration of 15min. I'm getting TPS numbers
between 60-150 which seems surprisingly low to me and also to several
people on IRC.

Now pg_test_fsync does seem to indicate that's not an unreasonable
commit rate if there was very little commit grouping going on:

Compare file sync methods using one 8kB write:
open_datasync 100.781 ops/sec 9922 usecs/op
fdatasync 71.088 ops/sec 14067 usecs/op

Compare file sync methods using two 8kB writes:
open_datasync 50.286 ops/sec 19886 usecs/op
fdatasync 80.349 ops/sec 12446 usecs/op

And iostat does seem to indicate the drives are ~ 80% utilized with
high write await times So maybe this is just what the system is
capable of with synchronous_commit?

Is anyone really familiar with pg_bench on similar hardware? Are these
numbers reasonable? Any suggestion for how to run it to get the most
realistic measure of Postgres on this machine?

starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 300
query mode: simple
number of clients: 32
number of threads: 4
duration: 900 s
number of transactions actually processed: 109424
latency average: 263.196 ms
tps = 121.464536 (including connections establishing)
tps = 121.464824 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set aid random(1, 100000 * :scale)
0.001 \set bid random(1, 1 * :scale)
0.001 \set tid random(1, 10 * :scale)
0.001 \set delta random(-5000, 5000)
0.229 BEGIN;
12.589 UPDATE pgbench_accounts SET abalance = abalance +
:delta WHERE aid = :aid;
0.280 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
1.442 UPDATE pgbench_tellers SET tbalance = tbalance +
:delta WHERE tid = :tid;
12.435 UPDATE pgbench_branches SET bbalance = bbalance +
:delta WHERE bid = :bid;
0.222 INSERT INTO pgbench_history (tid, bid, aid, delta,
mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
237.623 END;

And the iostat for the period the pg_bench is running:

avg-cpu: %user %nice %system %iowait %steal %idle
1,31 0,00 0,43 20,48 0,00 77,78

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0,00 2,47 14,31 181,13 175,96 2897,77
31,45 80,85 413,63 67,55 440,97 4,16 81,35
sda 0,01 2,47 23,77 181,13 292,04 2897,77
31,14 66,72 325,59 51,08 361,61 3,92 80,40
md0 0,00 0,00 0,05 0,00 0,20 0,00
8,00 0,00 0,00 0,00 0,00 0,00 0,00
md1 0,00 0,00 0,00 0,00 0,00 0,00
0,00 0,00 0,00 0,00 0,00 0,00 0,00
md2 0,00 0,00 38,04 182,79 467,79 2895,73
30,46 0,00 0,00 0,00 0,00 0,00 0,00
md3 0,00 0,00 0,00 0,01 0,00 0,04
8,00 0,00 0,00 0,00 0,00 0,00 0,00

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 9 years ago

In reply to: Greg Stark (#1)

Re: Pgbench performance tuning?

On 2016-08-04 18:48:31 +0100, Greg Stark wrote:

I'm trying to run pgbench on a moderately beefy machine (4-core 3.4GHz
with 32G of ram and md mirrored spinning rust drives) at scale 300
with 32 clients with duration of 15min. I'm getting TPS numbers
between 60-150 which seems surprisingly low to me and also to several
people on IRC.

What's the config? Version? What activity does pidstat -d -l indicate?
How much WAL was generated?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

stark@mit.edu

over 9 years ago

In reply to: Andres Freund (#2)

Re: Pgbench performance tuning?

On Thu, Aug 4, 2016 at 6:53 PM, Andres Freund <andres@anarazel.de> wrote:

What's the config? Version? What activity does pidstat -d -l indicate?
How much WAL was generated?

I know the specifics matter but I was also trying to avoid dumping too
much into the email.

The shared buffers is set to 16384 (128MB). Otherwise it's a default
config (For 9.4 and before I set checkpoint_segments to 32 as well).

Never seen pidstat before but pidstat -d looks like it prints very
similar output to the iostat output I was gathering already. There's
nothing else running on the machine.

I would have to rerun the benchmarks to measure the WAL output. Since
there's no replication and the database shut down cleanly it looks
like it recycled all the log files proactively and the last checkpoint
is in the earliest numbered xlog file.

I'll add checking the xlog position before and after pg_bench to the
script. I also wanted to use run the database under a binary calling
getrusage to report RUSAGE_CHILDREN for the database. It looks like
there's no stock program to do this but it shouldn't be hard to hack
one up.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 9 years ago

In reply to: Greg Stark (#3)

Re: Pgbench performance tuning?

On 2016-08-04 19:15:43 +0100, Greg Stark wrote:

On Thu, Aug 4, 2016 at 6:53 PM, Andres Freund <andres@anarazel.de> wrote:

What's the config? Version? What activity does pidstat -d -l indicate?
How much WAL was generated?

I know the specifics matter but I was also trying to avoid dumping too
much into the email.

The shared buffers is set to 16384 (128MB). Otherwise it's a default
config (For 9.4 and before I set checkpoint_segments to 32 as well).

Well, with 128MB I don't find that a very surprising result. You're
going to push data out to disk constantly. Given the averaged random
access pattern of pgbench that's not really something that interesting.

Never seen pidstat before but pidstat -d looks like it prints very
similar output to the iostat output I was gathering already. There's
nothing else running on the machine.

The question is which backends are doing the IO.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

jeff.janes@gmail.com

over 9 years ago

In reply to: Greg Stark (#1)

Re: Pgbench performance tuning?

On Thu, Aug 4, 2016 at 10:48 AM, Greg Stark <stark@mit.edu> wrote:

I'm trying to run pgbench on a moderately beefy machine (4-core 3.4GHz
with 32G of ram and md mirrored spinning rust drives) at scale 300
with 32 clients with duration of 15min. I'm getting TPS numbers
between 60-150 which seems surprisingly low to me and also to several
people on IRC.

I am assuming "md mirrored" means you have two drives, so there is no
striping. What is their RPM?

Your IO system is inadequate and I'm only slightly surprised at the low TPS.

Now pg_test_fsync does seem to indicate that's not an unreasonable
commit rate if there was very little commit grouping going on:

Compare file sync methods using one 8kB write:
open_datasync 100.781 ops/sec 9922 usecs/op
fdatasync 71.088 ops/sec 14067 usecs/op

Compare file sync methods using two 8kB writes:
open_datasync 50.286 ops/sec 19886 usecs/op
fdatasync 80.349 ops/sec 12446 usecs/op

And iostat does seem to indicate the drives are ~ 80% utilized with
high write await times So maybe this is just what the system is
capable of with synchronous_commit?

As an experiment, turn off synchrounous_commit and see what happens.

Mostly likely you are getting adequate commit grouping behavior, but
that doesn't apply to the table data. Each transaction dirties some
random page in the 3.9GB pgbench_accounts table, and there is no
effective grouping of those writes. That data isn't written
synchronously, but in the steady state it doesn't really matter
because at some point the write rate has to equilibrate with the
dirtying rate. If you can't make the disks faster then the TPS has to
drop to meet them. The most likely mechanism for this to happen is
that the disks are so harried trying to keep up with the dirty data
eviction, that they can't service the sync calls from the commits in a
timely matter. But if you took the sync calls out, the bottleneck
would likely just move somewhere else with only modest overall
improvement.

The way to tune it would be to make shared_buffers large enough that
all of pgbench_accounts fits in it, and increase checkpoint_segments
and checkpoint_timeout as much as you can afford, and increase
checkpoint_completion_target.

Cheers,

Jeff

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers