pgbench tps drops from 5000 to 37 when going from localhost to a server 13ms away
Hi all,
I've been doing some lightweight load testing with
“pgbench -c8 -j8 -T10”
When run locally on the postgres server I've been testing, this gives around
5000 tps.
When I run it from a server that has a 13ms ping latency, it drops to 37 tps.
This is using the default pgbench script; is this to be expected?
If so, why?
cheers,
Chris
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
That seems to be a large drop. On the other hand, 13 ms is also a very large network latency; on a LAN you're usually in the sub-millisecond area, so going from e.g. 0.2 ms to 13 ms is a 65-fold increase in latency. What is the network topology like?
Jan
Sent from my iPad
Sorry for top-posting my first response; that always happens when I am on the iPad.
I just checked on my home setup. Ping latency on a GBit crossover connection is around 0.3 ms, while pinging localhost is around 0.05 ms. You are at 13 ms, so that is a 260-fold increase, which is in about the same range as what you see with pgbench. Of course, with pgbench the actual payload comes into account on top.
Jan
On 24/07/2015 19:21, Jan Lentfer wrote:
That seems to be a large drop. On the other hand, 13 ms is also a very large
network latency; on a LAN you're usually in the sub-ms area, so going from
e.g. 0.2 ms to 13 ms is a 65-fold increase. What is the network
topology like?
10G between two colos, a switch at each end.
What's interesting is how it goes when adjusting the number of
threads/connections. As a baseline, I did one of each. As expected, this
gave around 10 tps (7 statements in the standard file * 13 ms latency
gives around 0.1 s per transaction). This behaviour continued linearly up to
-c3 -j3, achieving roughly 10 tps per client/thread:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 3
number of threads: 3
duration: 10 s
number of transactions actually processed: 317
latency average: 94.637 ms
tps = 31.494417 (including connections establishing)
tps = 31.668794 (excluding connections establishing)
statement latencies in milliseconds:
0.002016 \set nbranches 1 * :scale
0.000438 \set ntellers 10 * :scale
0.000379 \set naccounts 100000 * :scale
0.000489 \setrandom aid 1 :naccounts
0.000404 \setrandom bid 1 :nbranches
0.000413 \setrandom tid 1 :ntellers
0.000470 \setrandom delta -5000 5000
13.061975 BEGIN;
13.174287 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.127691 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
14.552413 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
14.109375 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE
bid = :bid;
13.113028 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.256063 END;
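The round-trip arithmetic above can be sketched as a quick model (an illustration, not part of the original thread; the 13 ms RTT and seven-statement count are taken from the messages):

```python
# Simple latency model for the default TPC-B script over a high-latency
# link: each SQL statement costs one network round trip, so transaction
# latency is roughly statements * RTT, and throughput scales linearly
# with clients until server-side contention takes over.

RTT_MS = 13      # observed ping latency
STATEMENTS = 7   # BEGIN, 3 UPDATEs, 1 SELECT, 1 INSERT, END

per_txn_ms = STATEMENTS * RTT_MS       # ~91 ms per transaction
tps_per_client = 1000.0 / per_txn_ms   # ~11 tps per client

def expected_tps(clients):
    """Ideal, contention-free throughput for a given client count."""
    return clients * tps_per_client

print(per_txn_ms, round(tps_per_client, 1), round(expected_tps(3), 1))
# prints: 91 11.0 33.0
```

The -c3 prediction (~33 tps) is close to the measured 31.5 tps; beyond that, the contention discussed later in the thread takes over and the per-client rate falls.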
Now, at -c4 and -j4 the problem becomes apparent:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 10 s
number of transactions actually processed: 362
latency average: 110.497 ms
tps = 35.912951 (including connections establishing)
tps = 36.111849 (excluding connections establishing)
statement latencies in milliseconds:
0.001917 \set nbranches 1 * :scale
0.000511 \set ntellers 10 * :scale
0.000384 \set naccounts 100000 * :scale
0.000525 \setrandom aid 1 :naccounts
0.000406 \setrandom bid 1 :nbranches
0.000472 \setrandom tid 1 :ntellers
0.000483 \setrandom delta -5000 5000
13.063624 BEGIN;
13.170928 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.122450 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
16.532138 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
28.090450 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE
bid = :bid;
13.112207 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.229887 END;
We've dropped down to 9 tps per client/thread, and it's the update
statements that are growing in time; here's the worst case:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 10
duration: 10 s
number of transactions actually processed: 382
latency average: 261.780 ms
tps = 37.310918 (including connections establishing)
tps = 37.517192 (excluding connections establishing)
statement latencies in milliseconds:
0.001798 \set nbranches 1 * :scale
0.000437 \set ntellers 10 * :scale
0.000385 \set naccounts 100000 * :scale
0.000597 \setrandom aid 1 :naccounts
0.000369 \setrandom bid 1 :nbranches
0.000437 \setrandom tid 1 :ntellers
0.000401 \setrandom delta -5000 5000
13.064963 BEGIN;
13.192241 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.121914 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
83.994516 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
113.638228 UPDATE pgbench_branches SET bbalance = bbalance + :delta
WHERE bid = :bid;
13.133390 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.288079 END;
What on earth could be causing those updates to now take nearly an order
of magnitude longer than the latency to the server?
cheers,
Chris
On Fri, Jul 24, 2015 at 12:13 PM, Chris Withers <chris@simplistix.co.uk>
wrote:
On 24/07/2015 19:21, Jan Lentfer wrote:
I've been doing some lightweight load testing with
“pgbench -c8 -j8 -T10”
When run locally on the postgres server I've been testing, this gives
around 5000 tps. When I do it from a server that has a 13 ms ping latency,
it drops to 37 tps. This is using the default pgbench script, is it to be
expected? If so, why?
...
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
This is your problem. There is only one row in the pgbench_branch table,
and every transaction has to update that one row. This is inherently a
serialized event.
You are trying to overcome the latency by cramming more stuff through the
pipe at a time, but what you are cramming through must be done single file.
One solution is to just use a larger scale factor on the benchmark so that
the transactions update random pgbench_branch rows, rather than all updating
the same row:
pgbench -i -s50
Alternatively, you could write a custom file so that all 7 commands are
sent down in one packet. That way the releasing COMMIT shows up at the same
time as the locking UPDATE does, rather than having 2 more round trips
between them. But this would violate the spirit of the benchmark, as
presumably you are expected to inspect the results of the SELECT statement
before proceeding with the rest of the transaction.
Or you could write a custom benchmark which more closely resembles whatever
your true workload will be.
Cheers,
Jeff
On 24/07/2015 22:51, Jeff Janes wrote:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
This is your problem. There is only one row in the pgbench_branch
table, and every transaction has to update that one row. This is
inherently a serialized event.
Indeed it was!
One solution is to just use a larger scale factor on the benchmark so that
the transactions update random pgbench_branch rows, rather than all updating
the same row: pgbench -i -s50
With a scale of 1000, everything except the END took roughly the latency
time. Interestingly, the END still seems to take longer when
threads/clients are really ramped up (100 vs 8). Why would that be?
Alternatively, you could write a custom file so that all 7 commands
are sent down in one packet.
How would you restructure the SQL so as to make that happen?
cheers,
Chris
On Mon, Jul 27, 2015 at 3:19 AM, Chris Withers <chris@simplistix.co.uk>
wrote:
With a scale of 1000, everything except the END took roughly the latency
time. Interestingly, the END still seems to take longer when threads/clients
are really ramped up (100 vs 8). Why would that be?
Could it be the fsync time? Presumably your server has a BBU on it, but
perhaps the transaction rate at the high scale factor is high enough to
overwhelm it. Are you sure you don't see the same timing when you run
pgbench locally?
Alternatively, you could write a custom file so that all 7 commands are
sent down in one packet.
How would you restructure the SQL so as to make that happen?
Just make a file with all the commands in it, and then remove all the
newlines from the non-backslash commands so that they are all on the same
line (separated by semicolons). Leave the backslash commands on their own
lines. Then use the -f switch to pgbench, giving it the file you just
made. Also, make sure you give it '-s 1000' as well, because when you use
the -f option pgbench does not auto-detect the scale factor.
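For example, such a file might look like this (a sketch based on the default TPC-B script shown earlier in the thread; the file name is made up). The backslash commands keep their own lines, and all seven SQL commands sit on one line, separated by semicolons:

```
-- tpcb_oneline.sql (hypothetical name): default TPC-B transaction with
-- all SQL on a single line so it is sent to the server in one packet
\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); END;
```

You would then run something like `pgbench -f tpcb_oneline.sql -s 1000 -c 8 -j 8 -T 10`. In pgbench's default simple query mode, the whole line travels as a single query message, so the transaction costs roughly one round trip instead of seven.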
Cheers,
Jeff