8.0beta5 results w/ dbt2
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
Mark
Mark Wong wrote:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
You saw an improvement of 23% from beta4 to beta5? I didn't think we
did any major performance changes between those releases. Tom?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput.
Between beta4 and beta5? That's astonishing. We didn't really do very
much that was performance-focused. Digging in the CVS logs, I see only
some changes intended to speed up subtransaction commit, which I suppose
is not relevant to your benchmark, plus these two changes:
2004-11-16 22:13 neilc
* src/backend/access/: hash/hash.c, nbtree/nbtree.c:
Micro-optimization of markpos() and restrpos() in btree and hash
indexes. Rather than using ReadBuffer() to increment the reference
count on an already-pinned buffer, we should use
IncrBufferRefCount() as it is faster and does not require acquiring
the BufMgrLock.
2004-11-09 16:42 tgl
* src/backend/optimizer/util/clauses.c: Allow planner to fold
"stable" functions to constants when forming selectivity estimates,
per recent discussion.
Given the right sort of queries I suppose the second change might create
a significant improvement, but I wasn't expecting 23% on a
general-purpose benchmark...
regards, tom lane
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
Do people really only look at the "throughput" numbers? Looking at those
graphs it seems that while most of the OLTP transactions are fulfilled in
subpar response times, there are still significant numbers that take as much
as 30s to fulfil.
Is this just a consequence of the type of queries being tested and the data
distribution? Or is Postgres handling queries that could run consistently fast
but for some reason generating large latencies sometimes?
I'm concerned because in my experience with web sites, once the database
responds slowly for even a small fraction of the requests, the web server
falls behind in handling http requests and a catastrophic failure builds.
It seems to me that reporting maximum, or at least the 95% confidence interval
(95% of queries executed between 50ms-20s) would be more useful than an
overall average.
Personally I would be happier with an average of 200ms but an interval of
100-300ms than an average of 100ms but an interval of 50ms-20s. Consistency
can be more important than sheer speed.
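Greg's interval-versus-average point can be sketched in a few lines. The samples below are invented purely to illustrate the effect, using his own example numbers; nothing here comes from the dbt2 runs:

```python
# Two invented workloads: one consistent at ~200ms, one faster on
# average but with a 20-second worst case.
def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    idx = max(0, min(len(ordered) - 1,
                     int(p / 100.0 * len(ordered) + 0.5) - 1))
    return ordered[idx]

steady = [0.1, 0.2, 0.3] * 100      # mean 200ms, worst case 300ms
spiky = [0.05] * 399 + [20.0]       # mean ~100ms, worst case 20s

for name, data in (("steady", steady), ("spiky", spiky)):
    mean = sum(data) / len(data)
    print(name, "mean:", round(mean, 3),
          "p95:", percentile(data, 95), "max:", max(data))
```

The "spiky" workload wins on the mean (about 100ms versus 200ms) yet its worst case is 20 seconds, which is exactly why a maximum or percentile figure tells you more than a throughput average.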
--
greg
On Tue, 2004-11-30 at 04:35, Tom Lane wrote:
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput.
Between beta4 and beta5? That's astonishing. We didn't really do very
much that was performance-focused. Digging in the CVS logs, I see only
some changes intended to speed up subtransaction commit, which I suppose
is not relevant to your benchmark, plus these two changes:
2004-11-16 22:13 neilc
* src/backend/access/: hash/hash.c, nbtree/nbtree.c:
Micro-optimization of markpos() and restrpos() in btree and hash
indexes. Rather than using ReadBuffer() to increment the reference
count on an already-pinned buffer, we should use
IncrBufferRefCount() as it is faster and does not require acquiring
the BufMgrLock.
2004-11-09 16:42 tgl
* src/backend/optimizer/util/clauses.c: Allow planner to fold
"stable" functions to constants when forming selectivity estimates,
per recent discussion.
Given the right sort of queries I suppose the second change might create
a significant improvement, but I wasn't expecting 23% on a
general-purpose benchmark...
Hmm... well it is a GP benchmark, but the results are based upon the
performance of one transaction while in the presence of the other
workloads. Speed up New Order and the whole thing improves.
If you look at the graph of New Order response time distribution, the
higher result gives much more frequent sub-second response for 8.0beta5
and the hump at around 23secs has moved down to 14secs. Notably, the
payment transaction and stock level transaction have almost identical
response time peaks in both cases. Perhaps some interaction between them
has been slowing us down? Now it's gone...
The results seem to be significantly different, so I believe the
results. Well done Mark - great new graphs. Any chance we could see the
graphs showing 0.5 sec bins on the x axis, with all data < 0.5 sec
removed from the graph so we can show the tail? Or point me at the data?
Very pleased....
This shows me one additional thing: we aren't using sufficiently good
instrumentation to understand where the problems lie.
--
Best Regards, Simon Riggs
Mark Wong wrote:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
Is it possible that there are some other differences affecting the result?
At "Table Blocks Read" there is a "history" table in #191, but it does not
show up in #199. Just a thought... but can you explain, Mark?
Best Regards,
Michael
Greg Stark <gsstark@mit.edu> writes:
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
Do people really only look at the "throughput" numbers? Looking at those
graphs it seems that while most of the OLTP transactions are fulfilled in
subpar response times, there are still significant numbers that take as much
as 30s to fulfil.
Is this just a consequence of the type of queries being tested and the data
distribution? Or is Postgres handling queries that could run consistently fast
but for some reason generating large latencies sometimes?
Given the regular shape of the first graph (transactions/minute),
there's every reason to think that the slowdowns are caused by
checkpoint I/O storms. It would be worth the trouble to experiment
with adjusting the bgwriter parameters to even out the flow of write
operations. (I think I've already pointed out that the current defaults
for the bgwriter seem mighty conservative.)
regards, tom lane
On Mon, 2004-11-29 at 16:01 -0800, Mark Wong wrote:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
It appears that WAL archiving (archive_command) is configured in 191 and
disabled (unset) in 199. Perhaps this accounts for some of the
difference?
--
On Tue, Nov 30, 2004 at 07:12:10AM +0000, Simon Riggs wrote:
If you look at the graph of New Order response time distribution, the
higher result gives much more frequent sub-second response for 8.0beta5
and the hump at around 23secs has moved down to 14secs. Notably, the
payment transaction and stock level transaction have almost identical
response time peaks in both cases. Perhaps some interaction between them
has been slowing us down? Now it's gone...
The results seem to be significantly different, so I believe the
results. Well done Mark - great new graphs. Any chance we could see the
graphs showing 0.5 sec bins on the x axis, with all data < 0.5 sec
removed from the graph so we can show the tail? Or point me at the data?
The data files used to generate the charts are here:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/driver/
For each transaction:
delivery.data
new_order.data
order_status.data
payment.data
stock_level.data
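For the 0.5-second bins Simon asked for, a few lines of scripting over those files suffice. The column holding the response time is an assumption here (I have not checked the dbt2 driver's output layout), so RT_COL and the delimiter may need adjusting:

```python
# Bin response times into 0.5s buckets, dropping the sub-0.5s mass so
# the tail of the distribution is visible.  RT_COL is a guess at which
# column of the *.data files holds the response time.
import csv
from collections import Counter

RT_COL = 2   # hypothetical column index of the response time
BIN = 0.5    # bin width in seconds

def tail_histogram(path):
    bins = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f):
            rt = float(row[RT_COL])
            if rt >= BIN:                  # drop everything under 0.5s
                bins[int(rt / BIN) * BIN] += 1
    return dict(sorted(bins.items()))
```

Plotting the returned dict with 0.5s steps on the x axis gives the truncated-tail chart requested above.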
Mark
On Tue, Nov 30, 2004 at 08:34:20AM +0100, Michael Paesold wrote:
Mark Wong wrote:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
Is it possible that there are some other differences affecting the result?
At "Table Blocks Read" there is a "history" table in #191, but it does not
show up in #199. Just a thought... but can you explain, Mark?
Best Regards,
Michael
Ah, the chart differences are due to a faulty script. That should be
cleared up and updated shortly. As for other differences to the beta4
test, I ran the beta5 test for 30 minutes as opposed to 60,
random_page_cost=2 instead of 4, and I have archiving disabled. I've
generally found the latter two settings to have minimal effects but I
can always reverify.
Mark
On Tue, Nov 30, 2004 at 10:57:02AM -0500, Tom Lane wrote:
Greg Stark <gsstark@mit.edu> writes:
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
Do people really only look at the "throughput" numbers? Looking at those
graphs it seems that while most of the OLTP transactions are fulfilled in
subpar response times, there are still significant numbers that take as much
as 30s to fulfil.
Is this just a consequence of the type of queries being tested and the data
distribution? Or is Postgres handling queries that could run consistently fast
but for some reason generating large latencies sometimes?
Given the regular shape of the first graph (transactions/minute),
there's every reason to think that the slowdowns are caused by
checkpoint I/O storms. It would be worth the trouble to experiment
with adjusting the bgwriter parameters to even out the flow of write
operations. (I think I've already pointed out that the current defaults
for the bgwriter seem mighty conservative.)
regards, tom lane
I do have bgwriter_delay increased to 10, per previous
recommendation, which did smooth out the throughput graph
considerably. I can continue to adjust those settings.
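For anyone reproducing these runs: all of the knobs under discussion live in postgresql.conf. A sketch of the relevant fragment follows; bgwriter_delay and checkpoint_timeout are the values Mark reports in this thread, while the other values are my recollection of the 8.0-era defaults and should be treated as assumptions:

```ini
# postgresql.conf fragment (8.0-era parameter names)
bgwriter_delay = 10          # ms between bgwriter rounds; thread value
checkpoint_timeout = 1800    # s between timed checkpoints; thread value
bgwriter_percent = 1         # assumed default: % of dirty buffers per round
bgwriter_maxpages = 100      # assumed default: page cap per round
checkpoint_segments = 3      # assumed default: WAL segments per checkpoint
```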
Mark
On Tue, Nov 30, 2004 at 11:03:03AM -0500, Rod Taylor wrote:
On Mon, 2004-11-29 at 16:01 -0800, Mark Wong wrote:
I have some initial results using 8.0beta5 with our OLTP workload.
Off the bat I see about a 23% improvement in overall throughput. The
most significant thing I've noticed was in the oprofile report where
FunctionCall2 and hash_seq_search have moved down the profile a bit.
Also, I have libc with symbols now so we can see what it's doing within
the oprofile output.
8.0beta5 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
8.0beta4 results:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
throughput: 3323.07
It appears that WAL archiving (archive_command) is configured in 191 and
disabled (unset) in 199. Perhaps this accounts for some of the
difference?
I've found the archiving to be about a 1% overhead and in all my
random testing since then I haven't seen evidence otherwise. Here's
a link to that message:
http://archives.postgresql.org/pgsql-hackers/2004-09/msg00430.php
But I'll try again too as things may have changed.
Mark
Mark Wong <markw@osdl.org> writes:
I do have bgwriter_delay increased to 10, per previous
recommendation, which did smooth out the throughput graph
considerably. I can continue to adjust those settings.
Please try a variety of settings and post your results. It would give
us some hard data to help in deciding what the defaults ought to be.
(I really doubt that the current defaults are reasonable...)
regards, tom lane
Tom,
I do have bgwriter_delay increased to 10, per previous
recommendation, which did smooth out the throughput graph
considerably. I can continue to adjust those settings.
Please try a variety of settings and post your results. It would give
us some hard data to help in deciding what the defaults ought to be.
(I really doubt that the current defaults are reasonable...)
Doing a systematic progression test with bgwriter has been on my TODO list for
a while. Unfortunately, the Scalable Test Platform, which is designed for
short runs with no vacuum, isn't the best place for it. Mark and OSDL
allocated us a dedicated machine but I've been tied up with paid work for the
last 4 weeks and unable to finish the setup.
The idea is to have a machine that we can do 4-hour runs, with vacuum, of
different bgwriter, checkpoint, etc. settings. I've also built a 200GB DSS
database for doing some significant testing with DSS workloads after some of
the discussions around shared_buffers in November.
If anyone here has a great desire to write database-backed GUC test scripts
(pref in Perl) then please give me a buzz on the testperf project
(www.pgfoundry.org/testperf).
--
Josh Berkus
Aglio Database Solutions
San Francisco
On Tue, Nov 30, 2004 at 02:00:29AM -0500, Greg Stark wrote:
Mark Wong <markw@osdl.org> writes:
I have some initial results using 8.0beta5 with our OLTP workload.
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/
throughput: 4076.97
Do people really only look at the "throughput" numbers? Looking at those
graphs it seems that while most of the OLTP transactions are fulfilled in
subpar response times, there are still significant numbers that take as much
as 30s to fulfil.
Is this just a consequence of the type of queries being tested and the data
distribution? Or is Postgres handling queries that could run consistently fast
but for some reason generating large latencies sometimes?
I'm concerned because in my experience with web sites, once the database
responds slowly for even a small fraction of the requests, the web server
falls behind in handling http requests and a catastrophic failure builds.
It seems to me that reporting maximum, or at least the 95% confidence interval
(95% of queries executed between 50ms-20s) would be more useful than an
overall average.
Personally I would be happier with an average of 200ms but an interval of
100-300ms than an average of 100ms but an interval of 50ms-20s. Consistency
can be more important than sheer speed.
Looking at just the throughput number is oversimplifying it a bit. The
scale factor (size of the database) limits what your maximum
throughput can be with constraints on think times (delays between
transaction requests) and the number of terminals simulated, which is
also dictated by the size of the database. So given the throughput
with a scale factor (600 in these tests) you can infer whether or not
the response times are reasonable. At the 600 warehouse scale
factor, we could theoretically hit about 7200 new-order transactions
per minute. The math is roughly 12 * warehouses.
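Mark's ceiling can be written out explicitly. The 12-per-minute-per-warehouse figure and the 600-warehouse scale factor come from his message; the last line just relates the reported beta5 number to that ceiling:

```python
# Theoretical new-order throughput ceiling, per Mark's rough rule of
# 12 new-order transactions per minute per warehouse.
warehouses = 600
ceiling = 12 * warehouses   # new-order transactions per minute
reported = 4076.97          # beta5 throughput from this thread

print(ceiling)                        # 7200
print(round(reported / ceiling, 3))   # fraction of the ceiling achieved
```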
I do agree that reporting max response times and a confidence
interval (I have been meaning to report a 90th percentile number)
would be informative in addition to a mean. Instead I included the
distribution charts in the meantime...
Mark
On Tue, Nov 30, 2004 at 10:51:42PM +0000, Simon Riggs wrote:
My suggestion: increase checkpoint_timeout to 600 secs, increase
bgwriter parameters also, to reduce how frequently it is called, as well
as increase the number of blocks per cycle.
Ok, here are a series of three tests varying the bgwriter_delay at 1,
50, and 100:
http://www.osdl.org/projects/dbt2dev/results/pgsql/bgwriter_delay/
I also reduced checkpoint_timeout to 600, where it was previously at
1800 with the results I posted previously. The throughput changes
weren't significant and the oprofile data is more or less the same.
I'll try varying the bgwriter_maxpages next (that's the number of
blocks per cycle, right?)
Mark
Mark,
Ok, here are a series of three tests varying the bgwriter_delay at 1,
50, and 100:
http://www.osdl.org/projects/dbt2dev/results/pgsql/bgwriter_delay/
Hmmm. Looks inconclusive. The differences between the runs are < 0.3%,
which is a margin of error by anyone's definition.
Will have to develop better tests ...
--
Josh Berkus
Aglio Database Solutions
San Francisco
On Mon, 2004-12-06 at 17:43, Josh Berkus wrote:
Mark,
Ok, here are a series of three tests varying the bgwriter_delay at 1,
50, and 100:
http://www.osdl.org/projects/dbt2dev/results/pgsql/bgwriter_delay/
Hmmm. Looks inconclusive. The differences between the runs are < 0.3%,
which is a margin of error by anyone's definition.
Will have to develop better tests ...
Josh is right - these are inconclusive. That usually means the other
settings are still very sub-optimal.
The graphs show the effect of checkpointing is still very large, so it
looks like the bgwriter is ineffective.
Varying bgwriter_maxpages upwards should take performance higher.
--
Best Regards, Simon Riggs
On Mon, 2004-12-06 at 17:42, Mark Wong wrote:
On Tue, Nov 30, 2004 at 10:51:42PM +0000, Simon Riggs wrote:
My suggestion: increase checkpoint_timeout to 600 secs, increase
bgwriter parameters also, to reduce how frequently it is called, as well
as increase the number of blocks per cycle.
Ok, here are a series of three tests varying the bgwriter_delay at 1,
50, and 100:
http://www.osdl.org/projects/dbt2dev/results/pgsql/bgwriter_delay/
I also reduced checkpoint_timeout to 600, where it was previously at
1800 with the results I posted previously. The throughput changes
weren't significant and the oprofile data is more or less the same.
I'll try varying the bgwriter_maxpages next (that's the number of
blocks per cycle, right?)
Mark,
Few questions:
- can we put the logging to DEBUG1 please, so we can see the
checkpoints? ...and set debug_shared_buffers = 10
I don't understand why the checkpoints are so regular at 300 seconds if
the checkpoint_timeout is set to 1800 or other... exactly when and how
are those parameters provided to the server?
- can we set checkpoint_segments to 8192 just to see if that changes the
checkpoint frequency (it should)
- the log output shows the database starts about 4 hours before the main
test starts... err, what's going on there? Maybe we could get more tests
in if that time could be reduced
- the explain plan output is missing...
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/db/plan0.out.gz
- the log output shows deadlocks occurring - is there something about
the application of DBT-2 which is actually causing contention? Might
that have changed between beta4 and beta5? The earlier lock waits we saw
("Exclusive Lock" thread) are likely to be related to that. Is there
some other artifact of the test that could cause this...random number
generator....etc. My understanding was that TPC-C didn't deadlock, but I
could be wrong there. This could easily be throwing off the test
results... usually to do with the order in which locks are occurring...
if it's not, I hope it's not a bug...
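On the 300-second checkpoint regularity asked about above: a checkpoint fires when checkpoint_timeout expires or when checkpoint_segments WAL segments have filled since the last checkpoint, whichever happens first, so a heavy write load can force checkpoints far more often than the timeout alone suggests. A rough model (16MB is the standard WAL segment size; the write rate is an illustrative assumption):

```python
# Rough model of checkpoint spacing: the timeout trigger races the
# WAL-volume trigger, and whichever fires first wins.
SEGMENT_MB = 16  # standard WAL segment size

def checkpoint_interval(timeout_s, segments, wal_mb_per_s):
    """Seconds between checkpoints under a steady WAL write rate."""
    fill_time_s = segments * SEGMENT_MB / wal_mb_per_s
    return min(timeout_s, fill_time_s)

# With checkpoint_timeout=1800 but a write rate that fills the
# configured segments in ~5 minutes, checkpoints land every ~300s:
print(checkpoint_interval(1800, 100, 5.3))
```

This is also why raising checkpoint_segments, as suggested above, should stretch the checkpoint interval on a write-heavy run.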
--
Best Regards, Simon Riggs
On Mon, Dec 06, 2004 at 09:28:15PM +0000, Simon Riggs wrote:
Mark,
Few questions:
- can we put the logging to DEBUG1 please, so we can see the
checkpoints? ...and set debug_shared_buffers = 10
Ok, will do.
I don't understand why the checkpoints are so regular at 300 seconds if
the checkpoint_timeout is set to 1800 or other... exactly when and how
are those parameters provided to the server?
I don't think I do either. I always set the parameters by editing the
postgresql.conf file.
- can we set checkpoint_segments to 8192 just to see if that changes the
checkpoint frequency (it should)
Ok.
- the log output shows the database starts about 4 hours before the main
test starts... err, what's going on there? Maybe we could get more tests
in if that time could be reduced
I start the 5000 clients one every 3 seconds. I tend to find if I start them
too fast, my client tends to start dropping connections. Maybe a
tcp/ip tuning problem between my client and driver.
- the explain plan output is missing...
http://www.osdl.org/projects/dbt2dev/results/dev4-010/199/db/plan0.out.gz
Ugh, I really do mean to fix that... Something changed so it's not
being captured at all. It really should be easy for me to fix.
- the log output shows deadlocks occurring - is there something about
the application of DBT-2 which is actually causing contention? Might
that have changed between beta4 and beta5? The earlier lock waits we saw
("Exclusive Lock" thread) are likely to be related to that. Is there
some other artifact of the test that could cause this...random number
generator....etc. My understanding was that TPC-C didn't deadlock, but I
could be wrong there. This could easily be throwing off the test
results... usually to do with the order in which locks are occurring...
if its not, I hope its not a bug,,,
Nothing that I can think of. Each thread is initialized with a
different random number seed, so we shouldn't see any identical
transactions occurring because of that.
Mark