Simulating Clog Contention

Started by Simon Riggs over 14 years ago · 25 messages · hackers
#1 Simon Riggs
simon@2ndQuadrant.com

In order to simulate real-world clog contention, we need to use
benchmarks that deal with real world situations.

Currently, pgbench pre-loads data using COPY and executes a VACUUM so
that all hint bits are set on every row of every page of every table.
Thus, as pgbench runs it sees zero clog accesses from historical data.
As a result, clog access is minimised and the effects of clog
contention in the real world go unnoticed.

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.
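
As a rough sanity check (not part of the patch itself), the number of
distinct xids stamped on the loaded rows can be counted directly; the
text cast is only there because xid has no btree ordering. A stock
COPY load should report a single xid, while an INSERT load approaches
one xid per row:

SELECT count(DISTINCT xmin::text) AS distinct_xids
FROM pgbench_accounts;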

The patch uses multiple connections to load data using a predefined
script similar to the -N or -S logic.

$ pgbench --help
pgbench is a benchmarking tool for PostgreSQL.

Usage:
pgbench [OPTIONS]... [DBNAME]

Initialization options:
 -i           invokes initialization mode using COPY
 -I           invokes initialization mode using INSERTs
...

$ pgbench -I -c 4 -t 10000
creating tables...
filling accounts table with 100000 rows using inserts
set primary key...
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pgbench_branches_pkey" for table "pgbench_branches"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pgbench_tellers_pkey" for table "pgbench_tellers"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pgbench_accounts_pkey" for table "pgbench_accounts"
done.
transactions option ignored
transaction type: Load pgbench_accounts using INSERTs
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
number of transactions per client: 25000
number of transactions actually processed: 100000/100000
tps = 828.194854 (including connections establishing)
tps = 828.440330 (excluding connections establishing)

Yes, my laptop really is that slow. Contributions to improve that
situation gratefully received.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pgbench_clog_contention_preload.v1.patch (text/x-patch; charset=US-ASCII) +111 -38
#2 Cédric Villemain
cedric.villemain.debian@gmail.com
In reply to: Simon Riggs (#1)
Re: Simulating Clog Contention

$ pgbench --help
pgbench is a benchmarking tool for PostgreSQL.

Usage:
 pgbench [OPTIONS]... [DBNAME]

Initialization options:
 -i           invokes initialization mode using COPY
 -I           invokes initialization mode using INSERTs

Sounds useful.

What about a long extra option, --inserts, like pg_dump has?
pgbench -i --inserts ...

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

#3 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#1)
Re: Simulating Clog Contention

On 12.01.2012 14:31, Simon Riggs wrote:

In order to simulate real-world clog contention, we need to use
benchmarks that deal with real world situations.

Currently, pgbench pre-loads data using COPY and executes a VACUUM so
that all hint bits are set on every row of every page of every table.
Thus, as pgbench runs it sees zero clog accesses from historical data.
As a result, clog access is minimised and the effects of clog
contention in the real world go unnoticed.

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.

No doubt this is handy for testing this particular area, but overall I
feel this is too much of a one-trick pony to include in pgbench.

Alternatively, you could do something like this:

do $$
declare
  i int4;
  naccounts int4;
begin
  select count(*) into naccounts from pgbench_accounts;
  for i in 1..naccounts loop
    -- use a begin-exception block to create a new subtransaction
    begin
      update pgbench_accounts set abalance = abalance where aid = i;
    exception
      when division_by_zero then
        raise 'unexpected division by zero error';
    end;
  end loop;
end;
$$;

after initializing the pgbench database, to assign distinct xmins to all
rows. Another idea would be to run pg_dump in --inserts mode, edit the
dump to remove BEGIN/COMMIT from it, and restore it back.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4 Peter Geoghegan
In reply to: Heikki Linnakangas (#3)
Re: Simulating Clog Contention

On 19 January 2012 14:36, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

No doubt this is handy for testing this particular area, but overall I feel
this is too much of a one-trick pony to include in pgbench.

I don't think that being conservative in accepting pgbench options is
the right way to go. It's already so easy for a non-expert to shoot
themselves in the foot that we don't do ourselves any favours by
carefully weighing the merits of an expert-orientated feature.

Have you ever read the man page for rsync? It's massive, with a huge
number of options, and rsync is supposed to be a tool that's widely
used by sysadmins, not a specialist database benchmarking tool.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#3)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 2:36 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 12.01.2012 14:31, Simon Riggs wrote:

In order to simulate real-world clog contention, we need to use
benchmarks that deal with real world situations.

Currently, pgbench pre-loads data using COPY and executes a VACUUM so
that all hint bits are set on every row of every page of every table.
Thus, as pgbench runs it sees zero clog accesses from historical data.
As a result, clog access is minimised and the effects of clog
contention in the real world go unnoticed.

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.

No doubt this is handy for testing this particular area, but overall I feel
this is too much of a one-trick pony to include in pgbench.

Alternatively, you could do something like this:

I think the one-trick pony is pgbench. It has exactly one starting
condition for its tests and that isn't even a real world condition.

The main point of including the option in pgbench is to have a
utility that produces an initial test condition that works the same
for everyone, so we can accept each other's benchmark results. We both
know that if someone posts that they have done $RANDOMSQL on a table
before running a test, it will just be ignored and people will say
user error. Some people will get it wrong when reproducing things and
we'll have chaos.

The patch exists as a way of testing the clog contention improvement
patches and provides a route to long term regression testing that the
solution(s) still work.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#5)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 10:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, Jan 19, 2012 at 2:36 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 12.01.2012 14:31, Simon Riggs wrote:

In order to simulate real-world clog contention, we need to use
benchmarks that deal with real world situations.

Currently, pgbench pre-loads data using COPY and executes a VACUUM so
that all hint bits are set on every row of every page of every table.
Thus, as pgbench runs it sees zero clog accesses from historical data.
As a result, clog access is minimised and the effects of clog
contention in the real world go unnoticed.

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.

No doubt this is handy for testing this particular area, but overall I feel
this is too much of a one-trick pony to include in pgbench.

Alternatively, you could do something like this:

I think the one-trick pony is pgbench. It has exactly one starting
condition for its tests and that isn't even a real world condition.

The main point of including the option in pgbench is to have a
utility that produces an initial test condition that works the same
for everyone, so we can accept each other's benchmark results. We both
know that if someone posts that they have done $RANDOMSQL on a table
before running a test, it will just be ignored and people will say
user error. Some people will get it wrong when reproducing things and
we'll have chaos.

The patch exists as a way of testing the clog contention improvement
patches and provides a route to long term regression testing that the
solution(s) still work.

I agree: I think this is useful.

However, I think we should follow the precedent of some of the other
somewhat-obscure options we've added recently and have only a long
form option for this: --inserts.

Also, I don't think the behavior described here should be joined at
the hip to --inserts:

+	 * We do this after a load by COPY, but before a load via INSERT
+	 *
+	 * This is done deliberately to ensure that no heap or index hints are
+	 * set before we start running the benchmark. This emulates the case
+	 * where data has arrived row at a time by INSERT, rather than being
+	 * bulkloaded prior to update.

I think that's also a useful behavior, but if we're going to have it,
we should have a separate option for it, like --create-indexes-early.
Otherwise, someone who wants to (for example) test only the impact of
doing inserts vs. COPY will get misleading answers.

Finally, it's occurred to me that it would be useful to make pgbench
respect -n even in initialization mode, and the SGML doc changes imply
that this patch does that somewhere or other, but maybe only when
you're doing INSERTS and not when you're doing copy, which would be
odd; and there's no sgml doc update for -n, and no command-line help
change either.

In short, I think the reason this patch seems like it's implementing
something fairly arbitrary is that it's really three pretty good ideas
that have been somewhat jumbled together.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#6)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 3:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I agree: I think this is useful.

However, I think we should follow the precedent of some of the other
somewhat-obscure options we've added recently and have only a long
form option for this: --inserts.

Yep, no problem.

Also, I don't think the behavior described here should be joined at
the hip to --inserts:

+        * We do this after a load by COPY, but before a load via INSERT
+        *
+        * This is done deliberately to ensure that no heap or index hints are
+        * set before we start running the benchmark. This emulates the case
+        * where data has arrived row at a time by INSERT, rather than being
+        * bulkloaded prior to update.

I think that's also a useful behavior, but if we're going to have it,
we should have a separate option for it, like --create-indexes-early.
Otherwise, someone who wants to (for example) test only the impact of
doing inserts vs. COPY will get misleading answers.

Creating indexes later would invalidate the test conditions I was
trying to create, so that doesn't add a useful new initialisation
mode. (Creating the indexes causes all of the hint bits to be set).
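
To verify that claim, one possible sketch uses the pageinspect
extension (an assumption of this example, not something the patch
relies on); run it before and after index creation and watch the
count jump:

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Tuples on heap page 0 whose HEAP_XMIN_COMMITTED hint is set;
-- 256 is 0x0100, the HEAP_XMIN_COMMITTED flag in t_infomask.
SELECT count(*) AS hinted_tuples
FROM heap_page_items(get_raw_page('pgbench_accounts', 0))
WHERE (t_infomask & 256) <> 0;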

So that's just adding unrelated requirements for additional tests.
Yes, there are lots of additional tests we could get this code to
perform but we don't need to burden this patch with responsibility for
adding them, especially not when the tests mentioned don't refer to
any related patches in this commit fest and could be done at any time.
Any such change is clearly very low priority at this time.

Finally, it's occurred to me that it would be useful to make pgbench
respect -n even in initialization mode, and the SGML doc changes imply
that this patch does that somewhere or other, but maybe only when
you're doing INSERTS and not when you're doing copy, which would be
odd; and there's no sgml doc update for -n, and no command-line help
change either.

I'll fix the docs.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#8 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#7)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 10:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Also, I don't think the behavior described here should be joined at
the hip to --inserts:

+        * We do this after a load by COPY, but before a load via INSERT
+        *
+        * This is done deliberately to ensure that no heap or index hints are
+        * set before we start running the benchmark. This emulates the case
+        * where data has arrived row at a time by INSERT, rather than being
+        * bulkloaded prior to update.

I think that's also a useful behavior, but if we're going to have it,
we should have a separate option for it, like --create-indexes-early.
Otherwise, someone who wants to (for example) test only the impact of
doing inserts vs. COPY will get misleading answers.

Creating indexes later would invalidate the test conditions I was
trying to create, so that doesn't add a useful new initialisation
mode. (Creating the indexes causes all of the hint bits to be set).

Right, but the point is that to address Heikki's objection that this
is a special-purpose hack, we should try to make it general, so that
it can be used by other people for other things. For example, if the
options are separated, you can use this to measure how much slower
--inserts is than the regular way. But if that also changes the way
indexes are created, then you can't. Moreover, since the
documentation mentioned only one of those two changes and not the
other, you might reasonably think that you've conducted a valid test.
We could document that --inserts changes the behavior in multiple
ways, but then the switch will end up being a bit of a misnomer, so I
think it's better to have a separate switch for each behavior someone
might want.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#8)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 4:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 19, 2012 at 10:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Also, I don't think the behavior described here should be joined at
the hip to --inserts:

+        * We do this after a load by COPY, but before a load via INSERT
+        *
+        * This is done deliberately to ensure that no heap or index hints are
+        * set before we start running the benchmark. This emulates the case
+        * where data has arrived row at a time by INSERT, rather than being
+        * bulkloaded prior to update.

I think that's also a useful behavior, but if we're going to have it,
we should have a separate option for it, like --create-indexes-early.
Otherwise, someone who wants to (for example) test only the impact of
doing inserts vs. COPY will get misleading answers.

Creating indexes later would invalidate the test conditions I was
trying to create, so that doesn't add a useful new initialisation
mode. (Creating the indexes causes all of the hint bits to be set).

Right, but the point is that to address Heikki's objection that this
is a special-purpose hack, we should try to make it general, so that
it can be used by other people for other things.

This supports running hundreds of different tests because it creates a
useful general starting condition. It's not a special purpose hack
because it's not a hack, nor is it special purpose.

Yes, we could have an option to run with no indexes. Or we could have
an option to run with 2 indexes as well. We could do all sorts of
things. None of that is important, because there aren't any patches in
the queue that need those tests and it's too late to do it in this
release. And if it really is important you can do it in the next
release.

If we have time to spend we should be spending it on running the patch
to test the effectiveness of other patches in the queue, not on
inventing new tests that have no relevance.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#10 Marti Raudsepp
marti@juffo.org
In reply to: Robert Haas (#8)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 18:12, Robert Haas <robertmhaas@gmail.com> wrote:

Right, but the point is that to address Heikki's objection that this
is a special-purpose hack, we should try to make it general, so that
it can be used by other people for other things.

Personally I would like to see support for more flexibility in pgbench
scripts. It would be useful to allow scripts to contain custom
initialization sections -- for scripts that want a custom schema, as
well as different ways to populate the standard schema.

Regards,
Marti

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#9)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 11:46 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, Jan 19, 2012 at 4:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 19, 2012 at 10:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Also, I don't think the behavior described here should be joined at
the hip to --inserts:

+        * We do this after a load by COPY, but before a load via INSERT
+        *
+        * This is done deliberately to ensure that no heap or index hints are
+        * set before we start running the benchmark. This emulates the case
+        * where data has arrived row at a time by INSERT, rather than being
+        * bulkloaded prior to update.

I think that's also a useful behavior, but if we're going to have it,
we should have a separate option for it, like --create-indexes-early.
Otherwise, someone who wants to (for example) test only the impact of
doing inserts vs. COPY will get misleading answers.

Creating indexes later would invalidate the test conditions I was
trying to create, so that doesn't add a useful new initialisation
mode. (Creating the indexes causes all of the hint bits to be set).

Right, but the point is that to address Heikki's objection that this
is a special-purpose hack, we should try to make it general, so that
it can be used by other people for other things.

This supports running hundreds of different tests because it creates a
useful general starting condition. It's not a special purpose hack
because it's not a hack, nor is it special purpose.

Yes, we could have an option to run with no indexes. Or we could have
an option to run with 2 indexes as well. We could do all sorts of
things. None of that is important, because there aren't any patches in
the queue that need those tests and it's too late to do it in this
release. And if it really is important you can do it in the next
release.

If we have time to spend we should be spending it on running the patch
to test the effectiveness of other patches in the queue, not on
inventing new tests that have no relevance.

I feel I've adequately explained why it makes sense to me to separate
those options. If you want, I'll do the work myself; it will take
less time than arguing about it.

On the other hand, if you wish to insist that we should only commit
this patch if --inserts makes multiple, unrelated, undocumented
changes to the initial test configurations, then I'll join Heikki in
voting against this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#11)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 5:47 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I feel I've adequately explained why it makes sense to me to separate
those options.  If you want, I'll do the work myself; it will take
less time than arguing about it.

If you have time to contribute, please use the patch as it stands to test
the other patches in the CF queue.

It's more important that we measure and fix clog contention than have
a new pgbench feature with no immediate value to the next release of
Postgres.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#13 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#12)
Re: Simulating Clog Contention

On Thu, Jan 19, 2012 at 1:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, Jan 19, 2012 at 5:47 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I feel I've adequately explained why it makes sense to me to separate
those options.  If you want, I'll do the work myself; it will take
less time than arguing about it.

If you have time to contribute, please use the patch as it stands to test
the other patches in the CF queue.

Those things aren't mutually exclusive; whether or not I spend an hour
whacking this patch around isn't going to have any impact on how much
benchmarking I get done. Benchmarking is mostly waiting, and I can
do other things while the tests are going.

Just to reiterate a point I've made previously, Nate Boley's test
machine was running another big job for several weeks straight, and I
haven't been able to use the machine for anything. It seems to be
unloaded at the moment so I'll try to squeeze in some tests, but I
don't know how long it will stay that way. It's been great to have
nearly unimpeded access to this for most of the cycle, but all good
things must (and do) come to an end. In any event, none of this has
much impact on the offer above, which is a small amount of code that I
will be happy to attend to if you do not wish to do so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14 Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Simon Riggs (#1)
Re: Simulating Clog Contention

At 2012-01-12 12:31:20 +0000, simon@2ndQuadrant.com wrote:

The following patch adds a pgbench option -I to load data using
INSERTs

This is just to confirm that the patch applies and builds and works
fine (though of course it does take a long time… pity there doesn't
seem to be any easy way to add progress indication like -i has).

I'm aware of the subsequent discussion about using only a long option,
documenting -n, and adding a knob to control index creation timing. I
don't have a useful opinion on any of those things. It's just that the
patch was marked "Needs review" and it was only while waiting for 100k
inserts to run that I thought of checking if there was any discussion
about it. Oops.

Yes, my laptop really is that slow.

It appears to be eight times as fast as mine.

-- ams

#15 Merlin Moncure
mmoncure@gmail.com
In reply to: Abhijit Menon-Sen (#14)
Re: Simulating Clog Contention

On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen <ams@toroid.org> wrote:

This is just to confirm that the patch applies and builds and works
fine (though of course it does take a long time… pity there doesn't
seem to be any easy way to add progress indication like -i has).

On any non-server-grade hardware you'd probably want to disable
synchronous_commit while loading. FWIW, this is a great addition to
pgbench.
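
One way to do that, assuming a throwaway database named pgbench (a
sketch, not something the patch does for you):

ALTER DATABASE pgbench SET synchronous_commit = off;
-- ... run pgbench -I here ...
ALTER DATABASE pgbench RESET synchronous_commit;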

merlin

#16 Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#15)
Re: Simulating Clog Contention

On Thu, Jan 26, 2012 at 11:41 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen <ams@toroid.org> wrote:

This is just to confirm that the patch applies and builds and works
fine (though of course it does take a long time… pity there doesn't
seem to be any easy way to add progress indication like -i has).

On any non-server-grade hardware you'd probably want to disable
synchronous_commit while loading.  FWIW, this is a great addition to
pgbench.

Do you object to separating out the three different things the patch
does and adding separate options for each one? If so, why? I find
them independently useful.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17 Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#16)
Re: Simulating Clog Contention

On Thu, Jan 26, 2012 at 10:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 26, 2012 at 11:41 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen <ams@toroid.org> wrote:

This is just to confirm that the patch applies and builds and works
fine (though of course it does take a long time… pity there doesn't
seem to be any easy way to add progress indication like -i has).

On any non-server-grade hardware you'd probably want to disable
synchronous_commit while loading.  FWIW, this is a great addition to
pgbench.

Do you object to separating out the three different things the patch
does and adding separate options for each one?  If so, why?  I find
them independently useful.

I didn't take a position on that -- although superficially it seems
like more granular control is good (and you can always group options
together with a 'super option', as in cp -a) -- just making a
general comment on the usefulness of testing against records that
don't have the same xid.

merlin

#18 Jeff Janes
jeff.janes@gmail.com
In reply to: Simon Riggs (#1)
Re: Simulating Clog Contention

On Thu, Jan 12, 2012 at 4:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.

Running with this patch with a non-default scale factor generates the
spurious notice:

"Scale option ignored, using pgbench_branches table count = 10"

In fact the scale option is not being ignored, because it was used to
initialize the pgbench_branches table count earlier in this same
invocation.

I think that even in normal (non-initialization) usage, this message
should be suppressed when the provided scale factor
is equal to the pgbench_branches table count.
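
For context, pgbench derives the runtime scale with essentially this
query, so when its result matches the -s value the warning conveys
nothing:

SELECT count(*) FROM pgbench_branches;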

Cheers,

Jeff

#19 Jeff Janes
jeff.janes@gmail.com
In reply to: Jeff Janes (#18)
Re: Simulating Clog Contention

On Fri, Jan 27, 2012 at 1:45 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Thu, Jan 12, 2012 at 4:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that have
large numbers of distinct un-hinted transaction ids. With a database
pre-created using this we will be better able to simulate and thus
more easily measure clog contention. Note that current clog has space
for 1 million xids, so a scale factor of greater than 10 is required
to really stress the clog.

Running with this patch with a non-default scale factor generates the
spurious notice:

"Scale option ignored, using pgbench_branches table count = 10"

In fact the scale option is not being ignored, because it was used to
initialize the pgbench_branches table count earlier in this same
invocation.

I think that even in normal (non-initialization) usage, this message
should be suppressed when the provided scale factor
is equal to the pgbench_branches table count.

The attached patch does just that. There is probably no reason to
warn people that we are doing what they told us to, but not for the
reason they think.

I think this change makes sense regardless of the disposition of the
thread topic.

Cheers,

Jeff

Attachments:

pgbench_scale.patch (application/octet-stream) +8 -5
#20 Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Janes (#19)
Re: Simulating Clog Contention

On Sat, Jan 28, 2012 at 3:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

I think that even in normal (non-initialization) usage, this message
should be suppressed when the provided scale factor
is equal to the pgbench_branches table count.

The attached patch does just that.  There is probably no reason to
warn people that we are doing what they told us to, but not for the
reason they think.

In my opinion, a more sensible approach than anything we're doing
right now would be to outright *reject* options that will only be
ignored. If -s isn't supported except with -i, then trying to specify
-s without -i should just error out at the options-parsing stage,
before we even try to connect to the database. It's not very helpful
to accept options and then ignore them, and we have many instances of
that right now: initialization-only switches are accepted and ignored
when not initializing, and run-only switches are accepted and ignored
when initializing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21 Jeff Janes
jeff.janes@gmail.com
In reply to: Robert Haas (#20)
#22 Jeff Janes
jeff.janes@gmail.com
In reply to: Abhijit Menon-Sen (#14)
#23 Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Janes (#22)
#24 Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Janes (#21)
#25 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#23)