Adding a pgbench run to buildfarm

Started by Bort, Paul over 19 years ago, 16 messages
#1Bort, Paul
pbort@tmwsystems.com

-hackers,

With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run.

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions. All of these will be adjustable using
the build-farm.conf mechanism already in place.
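
As a rough illustration of how those three values map onto pgbench's command line, here is a minimal sketch. The dictionary keys and the function name are hypothetical and are not actual build-farm.conf settings; only the pgbench switches (-i, -s, -c, -t) are real. Note that -t is counted per client, so 5 clients with -t 100 run 500 transactions in total.

```python
# Hypothetical defaults taken from the values proposed in this message;
# the key names are illustrative, not real build-farm.conf settings.
PGBENCH_DEFAULTS = {"scale": 10, "clients": 5, "transactions": 100}


def pgbench_commands(dbname, settings=None):
    """Return the (initialize, run) pgbench argument lists for dbname."""
    cfg = dict(PGBENCH_DEFAULTS)
    cfg.update(settings or {})
    # -i initializes the test tables, -s sets the scaling factor.
    init_cmd = ["pgbench", "-i", "-s", str(cfg["scale"]), dbname]
    # -c is the number of clients, -t the transactions run by each client.
    run_cmd = [
        "pgbench",
        "-c", str(cfg["clients"]),
        "-t", str(cfg["transactions"]),
        dbname,
    ]
    return init_cmd, run_cmd


if __name__ == "__main__":
    for cmd in pgbench_commands("pgbench_test"):
        print(" ".join(cmd))
```

Passing an override such as {"transactions": 1000} changes only that one value and leaves the other defaults alone.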

Comments? Suggestions?

Regards,
Paul Bort

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bort, Paul (#1)
Re: Adding a pgbench run to buildfarm

"Bort, Paul" <pbort@tmwsystems.com> writes:

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions.

100 transactions seems barely enough to get through startup transients.
Maybe 1000 would be good.

I think the hard part of this is the reporting process. How do we
track how performance varies over time? It doesn't seem very useful
to compare different buildfarm members, but a longitudinal display of
performance on a single buildfarm machine over time would be cool.
(I'm still missing Mark Wong's daily OSDL performance reports :-()

Actually the $64 question here is whether we trust pgbench as the
standard performance test ...

regards, tom lane

#3Mark Kirkwood
markir@paradise.net.nz
In reply to: Tom Lane (#2)
Re: Adding a pgbench run to buildfarm

Tom Lane wrote:

"Bort, Paul" <pbort@tmwsystems.com> writes:

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions.

100 transactions seems barely enough to get through startup transients.
Maybe 1000 would be good.

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?
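
The arithmetic behind "size > RAM" can be sketched as follows. This assumes the accounts table grows linearly with the scale factor at the rate implied by the figure above (about 130 MB at scale 10, so roughly 13 MB per unit); the function names are hypothetical and the estimate ignores indexes and the other tables.

```python
import math

# Assumption from the estimate above: scale factor 10 gives an accounts
# table of roughly 130 MB, i.e. about 13 MB per unit of scale factor.
MB_PER_SCALE_UNIT = 130 / 10


def estimated_size_mb(scale):
    """Estimated accounts table size in MB for a given scale factor."""
    return scale * MB_PER_SCALE_UNIT


def min_scale_exceeding_ram(ram_mb):
    """Smallest integer scale factor whose estimated size exceeds ram_mb."""
    return math.floor(ram_mb / MB_PER_SCALE_UNIT) + 1


if __name__ == "__main__":
    for ram_mb in (48, 512, 1024):
        scale = min_scale_exceeding_ram(ram_mb)
        print(ram_mb, scale, estimated_size_mb(scale))
```

Under that assumption a machine with 1 GB of RAM needs a scale factor of about 79 before the table no longer fits in memory, which is in line with trying 100.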

Cheers

Mark

#4Bort, Paul
pbort@tmwsystems.com
In reply to: Tom Lane (#2)
Re: Adding a pgbench run to buildfarm

100 transactions seems barely enough to get through startup transients.
Maybe 1000 would be good.

OK.

I think the hard part of this is the reporting process. How
do we track how performance varies over time? It doesn't
seem very useful to compare different buildfarm members, but
a longitudinal display of performance on a single buildfarm
machine over time would be cool.
(I'm still missing Mark Wong's daily OSDL performance reports :-()

I was thinking that the output from pgbench would be sent back to the
server and stored somewhere for later analysis.
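
For storage, the one number most likely worth extracting is the tps figure. The sketch below assumes pgbench prints lines of the form "tps = <number> (<note>)"; the sample text is made up for illustration and is not output from a real run, and the function name is hypothetical.

```python
import re

# Matches lines such as "tps = 85.184871 (including connections establishing)".
TPS_LINE = re.compile(
    r"^tps = ([0-9]+(?:\.[0-9]+)?)\s*\((.*)\)\s*$", re.MULTILINE
)

# Illustrative sample only; the numbers are invented.
SAMPLE = """\
scaling factor: 10
number of clients: 5
number of transactions per client: 1000
tps = 85.184871 (including connections establishing)
tps = 85.296346 (excluding connections establishing)
"""


def parse_tps(output):
    """Map the parenthesised note on each tps line to its numeric value."""
    return {note: float(value) for value, note in TPS_LINE.findall(output)}


if __name__ == "__main__":
    print(parse_tps(SAMPLE))
```

Keeping the note as the key means the parser does not need to know in advance how a given pgbench version labels its tps lines.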

Actually the $64 question here is whether we trust pgbench as
the standard performance test ...

I think that it's what we've got today, and if tomorrow it gets better,
then the data we get from the buildfarm will improve similarly.

Regards,
Paul Bort

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Kirkwood (#3)
Re: Adding a pgbench run to buildfarm

Mark Kirkwood <markir@paradise.net.nz> writes:

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?

That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like. The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three. Maybe we need to
divorce compile/regression tests from performance tests?

regards, tom lane

#6Mark Kirkwood
markir@paradise.net.nz
In reply to: Tom Lane (#5)
Re: Adding a pgbench run to buildfarm

Tom Lane wrote:

Mark Kirkwood <markir@paradise.net.nz> writes:

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?

That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like. The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three. Maybe we need to
divorce compile/regression tests from performance tests?

Right - this leads to further questions like, what the performance
testing on the buildfarms is actually for. If it is mainly to catch
regressions introduced by any new code, then scale factor 10 (i.e.,
essentially in-memory testing) may in fact be the best way to show this up.

Cheers

Mark

#7Gavin Sherry
swm@linuxworld.com.au
In reply to: Mark Kirkwood (#6)
Re: Adding a pgbench run to buildfarm

On Mon, 24 Jul 2006, Mark Kirkwood wrote:

Tom Lane wrote:

Mark Kirkwood <markir@paradise.net.nz> writes:

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?

That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like. The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three. Maybe we need to
divorce compile/regression tests from performance tests?

Right - this leads to further questions like, what the performance
testing on the buildfarms is actually for. If it is mainly to catch
regressions introduced by any new code, then scale factor 10 (i.e.,
essentially in-memory testing) may in fact be the best way to show this up.

It introduces a problem though. Not all machines stay the same over time.
A machine may be upgraded, a machine may be getting backed up or may in
some other way be utilised during a performance test. This would skew the
stats for that machine. It may confuse people more than help them...

At the very least, the performance figures would need to be accompanied by
details of what other processes were running and what resources they were
chewing during the test.
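
A very coarse version of that could be a snapshot of machine state stored next to each result. The sketch below is only an assumption about what such a record might contain: it uses the system load average as a stand-in for "what else was running", which says nothing about which processes were involved, and the function and key names are hypothetical.

```python
import os
import time


def run_context():
    """Snapshot of machine state to store alongside a benchmark result."""
    try:
        load1, load5, load15 = os.getloadavg()
    except (AttributeError, OSError):
        # Load averages are not available on every platform.
        load1 = load5 = load15 = None
    return {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "load_avg_1min": load1,
        "load_avg_5min": load5,
        "load_avg_15min": load15,
    }


if __name__ == "__main__":
    print(run_context())
```

Taking one snapshot before the run and one after would at least show whether the machine was busy while the test was in progress.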

This is what was nice about the OSDL approach. Each test was preceded by
an automatic reinstall of the OS and the machines were specifically for
testing. The tester had complete control.

We could perhaps mimic some of that using virtualisation tools which
control access to system resources but it won't work on all platforms. The
problem is that it probably introduces a new variable, in that I'm not
sure that virtualisation software can absolutely limit CPU resources a
particular container has. That is, you might not be able to get
reproducible runs with the same code. :(

Just some thoughts.

Thanks,

Gavin

#8Dave Page
dpage@vale-housing.co.uk
In reply to: Bort, Paul (#1)
Re: Adding a pgbench run to buildfarm

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bort, Paul
Sent: 24 July 2006 04:52
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Adding a pgbench run to buildfarm

-hackers,

With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run.

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions. All of these will be adjustable using
the build-farm.conf mechanism already in place.

Comments? Suggestions?

Please ensure the run is optional. The machine hosting Snake and
Bandicoot is currently running 16 builds a day, and I'd prefer not to
significantly add to its load.

Regards, Dave.

#9Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Mark Kirkwood (#3)
Re: Adding a pgbench run to buildfarm

Mark Kirkwood wrote:

Tom Lane wrote:

"Bort, Paul" <pbort@tmwsystems.com> writes:

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions.

100 transactions seems barely enough to get through startup transients.
Maybe 1000 would be good.

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?

hmm - that "1GB" is a rather optimistic estimate for most of the
buildfarm boxes (mine at least).
Of the six I have, only one actually has much RAM (allocated), and
lionfish, for example, is rather resource-starved at only 48(!) MB of
RAM and very limited disk space - which has been plenty until now for
doing the builds (with enough swap, of course).
I suppose that anything that would result in additional disk space usage
in excess of maybe 150MB or so would run it out of resources :-(

I'm also not too keen on running excessively long pgbench runs on some of
the buildfarm members so I would prefer to make that one optional.

Stefan

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Dave Page (#8)
Re: Adding a pgbench run to buildfarm

Dave Page wrote:

With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run.

Please ensure the run is optional. The machine hosting Snake and
Bandicoot is currently running 16 builds a day, and I'd prefer not to
significantly add to its load.

Rest easy. It will be optional, of course.

cheers

andrew

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Gavin Sherry (#7)
Re: Adding a pgbench run to buildfarm

Gavin Sherry wrote:

Not all machines stay the same over time.
A machine may be upgraded, a machine may be getting backed up or may in
some other way be utilised during a performance test. This would skew the
stats for that machine. It may confuse people more than help them...

At the very least, the performance figures would need to be accompanied by
details of what other processes were running and what resources they were
chewing during the test.

This is what was nice about the OSDL approach. Each test was preceded by
an automatic reinstall of the OS and the machines were specifically for
testing. The tester had complete control.

We could perhaps mimic some of that using virtualisation tools which
control access to system resources but it won't work on all platforms. The
problem is that it probably introduces a new variable, in that I'm not
sure that virtualisation software can absolutely limit CPU resources a
particular container has. That is, you might not be able to get
reproducible runs with the same code. :(

We are really not going to go in this direction. If you want ideal
performance tests then a heterogeneous distributed collection of
autonomous systems like buildfarm is not what you want.

You are going to have to live with the fact that there will be
occasional, possibly even frequent, blips in the data due to other
activity on the machine.

If you want tightly controlled or very heavy load testing this is the
wrong vehicle.

You might think that what that leaves us is not worth having - the
consensus in Toronto seemed to be that it is worth having, which is why
it is being pursued.

cheers

andrew

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: Adding a pgbench run to buildfarm

Tom Lane wrote:

Mark Kirkwood <markir@paradise.net.nz> writes:

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1 GB of RAM, this probably means not much
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
try 100 or even 200? (or recommend scale factor such that size > RAM)?

That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like. The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three. Maybe we need to
divorce compile/regression tests from performance tests?

We could have the system report build/regression results before going on
to do performance testing. I don't want to divorce them altogether if I
can help it, as it will make cleanup a lot messier.
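
The ordering described here might look like the sketch below: the build/regression status is reported first, the benchmark runs only if one is configured and the build succeeded, and a single cleanup step covers both phases. Every name in it is hypothetical; it is not the buildfarm client's actual structure.

```python
def run_member(build_and_test, report, cleanup, run_benchmark=None):
    """Report build status first, then optionally benchmark, then clean up."""
    try:
        status = build_and_test()
        # Sent before any benchmarking starts, so turnaround is unaffected.
        report("build", status)
        if status == "ok" and run_benchmark is not None:
            report("pgbench", run_benchmark())
    finally:
        # One cleanup for both phases, since they share the same install.
        cleanup()


if __name__ == "__main__":
    run_member(
        build_and_test=lambda: "ok",
        report=lambda kind, data: print(kind, data),
        cleanup=lambda: print("cleanup"),
        run_benchmark=lambda: 85.0,
    )
```

Leaving run_benchmark unset gives the current behaviour, which also fits the request that the pgbench step be optional.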

cheers

andrew

#13Bort, Paul
pbort@tmwsystems.com
In reply to: Andrew Dunstan (#11)
Re: Adding a pgbench run to buildfarm

Andrew Dunstan wrote:

We are really not going to go in this direction. If you want ideal
performance tests then a heterogeneous distributed collection of
autonomous systems like buildfarm is not what you want.

You are going to have to live with the fact that there will be
occasional, possibly even frequent, blips in the data due to other
activity on the machine.

If you want tightly controlled or very heavy load testing this is the
wrong vehicle.

You might think that what that leaves us is not worth having - the
consensus in Toronto seemed to be that it is worth having, which is why
it is being pursued.

I wasn't at the conference, but the impression I'm under is that the
point of this isn't to catch a change that causes a 1% slowdown; the
point is to catch much larger problems, probably 20% slowdown or more.
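
One way to express that kind of check, as a sketch only: compare the latest tps on a machine with the median of its own earlier runs, and flag a drop of 20% or more. The median is an assumption chosen so that an occasional blip in the history does not move the baseline; the function name and threshold parameter are hypothetical.

```python
from statistics import median


def is_regression(history, latest, threshold=0.20):
    """True if latest tps is at least `threshold` below the median of history.

    history holds earlier tps values from the same buildfarm member.
    """
    if not history:
        return False  # nothing to compare against yet
    baseline = median(history)
    return latest <= baseline * (1.0 - threshold)


if __name__ == "__main__":
    earlier = [100.0, 102.0, 98.0, 250.0, 101.0]  # 250.0 is a one-off blip
    print(is_regression(earlier, 79.0))
    print(is_regression(earlier, 99.0))
```

A single slow run would still trip the check, so in practice it probably makes sense to wait for a second low result before treating it as real.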

Given the concerns about running this on machines that don't have a lot
of CPU and disk to spare, should it ship disabled?

Andrew, what do you think of pgbench reports shipping separately? I have
no idea how the server end is set up, so I don't know how much of a pain
that would be.

Regards,
Paul Bort

P.S. My current thought for settings is scaling factor 10, users 5,
transactions 1000.

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Bort, Paul (#13)
Re: Adding a pgbench run to buildfarm

Bort, Paul wrote:

Given the concerns about running this on machines that don't have a lot
of CPU and disk to spare, should it ship disabled?

Yes, certainly.

Andrew, what do you think of pgbench reports shipping separately? I have
no idea how the server end is set up, so I don't know how much of a pain
that would be.

Well, we'll need to put in some changes to collect the data, certainly.
I don't see why we shouldn't ship the pgbench result separately, but ...

P.S. My current thought for settings is scaling factor 10, users 5,
transactions 1000.

... at this size it's hardly worth it. A quick test on my laptop showed
this taking about a minute for the setup and another minute for the run.
Unless we scale way beyond this I don't see any point in separate reporting.

cheers

andrew

#15Jim C. Nasby
jnasby@pervasive.com
In reply to: Bort, Paul (#1)
Re: Adding a pgbench run to buildfarm

On Sun, Jul 23, 2006 at 11:52:14PM -0400, Bort, Paul wrote:

-hackers,

With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run.

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions. All of these will be adjustable using
the build-farm.conf mechanism already in place.

Why is it being hard-coded? I think it makes a lot more sense to allow
pgbench options to be specified in the buildfarm config. Even better
yet would be specifying them on the command line, which would allow
members to run a more rigorous test once a day/week (I'm thinking one
that might take 30 minutes, which could well ferret out some issues that
a simple 5 minute test won't).
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#16Bort, Paul
pbort@tmwsystems.com
In reply to: Jim C. Nasby (#15)
Re: Adding a pgbench run to buildfarm

Jim Nasby wrote:

Why is it being hard-coded? I think it makes a lot more sense to allow
pgbench options to be specified in the buildfarm config. Even better
yet would be specifying them on the command line, which would allow
members to run a more rigorous test once a day/week (I'm thinking one
that might take 30 minutes, which could well ferret out some issues that
a simple 5 minute test won't).

They absolutely won't be hard-coded. I'm asking for values to use as
defaults in the config file.

Also allowing command-line parameters is interesting, but I think we
should wait on it until the initial version is in place.
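
If command-line parameters are added later, one possible shape is config-file defaults that the command line overrides per run. The sketch below assumes hypothetical option names (--pgbench-scale and so on) and a hypothetical defaults dictionary using the values from the P.S. earlier in the thread; none of these are existing buildfarm options.

```python
import argparse

# Hypothetical config-file defaults: scaling factor 10, 5 users,
# 1000 transactions.
CONFIG_DEFAULTS = {"scale": 10, "clients": 5, "transactions": 1000}


def effective_settings(argv, config=CONFIG_DEFAULTS):
    """Merge command-line overrides (if any) over the config defaults."""
    parser = argparse.ArgumentParser(description="pgbench settings sketch")
    parser.add_argument("--pgbench-scale", type=int, dest="scale")
    parser.add_argument("--pgbench-clients", type=int, dest="clients")
    parser.add_argument("--pgbench-transactions", type=int, dest="transactions")
    args = parser.parse_args(argv)
    settings = dict(config)
    for key, value in vars(args).items():
        if value is not None:  # only options actually given override
            settings[key] = value
    return settings


if __name__ == "__main__":
    print(effective_settings([]))
    print(effective_settings(["--pgbench-scale", "100"]))
```

A scheduled daily or weekly job could then pass larger values for a more rigorous run while the regular runs keep using the config defaults.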