8.2 is 30% better in pgbench than 8.3

Started by Pavel Stehule over 18 years ago, 30 messages

#1 Pavel Stehule
pavel.stehule@gmail.com

Hello

I was a little bit surprised. Is there any reason for it?

Today I got:

(1) pgbench -c 30 -t 100
(2) pgbench -c 60 -t 100
(3) pgbench -c 30 -t 200
(4) pgbench -c 60 -t 200

autovacuum on
linux (Fedora 7, ext3), 1 GB RAM, Pentium M 1.6 GHz

Results for 8.3 (8.2 in parentheses):

1. 300/303 (526/536)
2. 259/261 (423/429)
3. 310/312 (529/534)
4. 240/241 (413/416)

Regards
Pavel Stehule

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: 8.2 is 30% better in pgbench than 8.3

"Pavel Stehule" <pavel.stehule@gmail.com> writes:

I was a little bit surprised. Is there any reason for it?

Are you sure you're comparing apples to apples? In particular the
default autovacuuming setup is entirely different. With autovac off
I see 8.3 as faster than 8.2 in pgbench.

Also, remember a couple rules of thumb for choosing pgbench parameters:
keep -c less than the -s scale factor you used for pgbench -i (otherwise
you're mostly measuring update contention, because there are only -s
different rows in the branches table); and use -t at least 1000 or so
(otherwise startup transients are significant).
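The two rules of thumb above can be sketched as a quick pre-flight check. This is purely an illustrative helper, not part of pgbench itself:

```python
def check_pgbench_params(clients, transactions, scale):
    """Flag pgbench parameter choices that tend to give misleading numbers.

    Rules of thumb from above:
    - keep -c (clients) below the -s scale factor used at pgbench -i time,
      because the branches table has only `scale` rows to update;
    - use -t of at least 1000 or so, so startup transients stop dominating.
    """
    warnings = []
    if clients >= scale:
        warnings.append("-c >= -s: mostly measuring update contention")
    if transactions < 1000:
        warnings.append("-t < 1000: startup transients are significant")
    return warnings

# Runs like -c 30 -t 100 against tables initialized with the default
# scale factor of 1 would trip both checks.
```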

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

regards, tom lane

#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: 8.2 is 30% better in pgbench than 8.3

2007/7/21, Tom Lane <tgl@sss.pgh.pa.us>:

"Pavel Stehule" <pavel.stehule@gmail.com> writes:

I was a little bit surprised. Is there any reason for it?

Are you sure you're comparing apples to apples? In particular the
default autovacuuming setup is entirely different. With autovac off
I see 8.3 as faster than 8.2 in pgbench.

I am not sure. But more people will run this (or a similar) test, and
the difference has to be explained.

Also, remember a couple rules of thumb for choosing pgbench parameters:
keep -c less than the -s scale factor you used for pgbench -i (otherwise
you're mostly measuring update contention, because there are only -s
different rows in the branches table); and use -t at least 1000 or so
(otherwise startup transients are significant).

Ok, I have to do more tests.

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

Thank you for reply

Pavel Stehule

#4 Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#2)
Re: 8.2 is 30% better in pgbench than 8.3

On Sat, 21 Jul 2007, Tom Lane wrote:

With autovac off I see 8.3 as faster than 8.2 in pgbench.

Indeed. I'm seeing much better pgbench results from HEAD than 8.2 when I
set the configurations up identically. I'm hoping to have a comparison
set to show everyone this week.

and use -t at least 1000 or so (otherwise startup transients are
significant).

I personally consider any pgbench run that lasts less than several minutes
noise. On a system that hits 500 TPS like Pavel's, I'd want to see around
100,000 transactions before I consider the results significant. And then
I'd want a set of 3 at each configuration because even with longer runs,
you occasionally get really odd results. Until you have 3 it can be
unclear which is the weird one.
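Greg's targets translate into pgbench arguments and a quick outlier check along these lines (a sketch using only the numbers quoted above; -t in pgbench is per client):

```python
def transactions_per_client(total_target, clients):
    """pgbench's -t is per client, so divide the desired total count."""
    return -(-total_target // clients)  # ceiling division

def odd_one_out(results):
    """Given three TPS results, return the one farthest from the median."""
    s = sorted(results)
    median = s[1]
    return max(results, key=lambda r: abs(r - median))

# 100,000 total transactions with -c 30 means roughly -t 3334 per client;
# with three runs per configuration, the outlier is the one to distrust.
```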

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#5 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: 8.2 is 30% better in pgbench than 8.3

Hello,

I checked my tests again and I have different results. Now I tested
PostgreSQL on a dedicated server. Now 8.3 is about 20% faster. I didn't
see a strong impact from autovacuum. All numbers are approximate
only. I ran pgbench 3x for the following configurations (autovacuum on,
autovacuum off, statistics off) and for -t transactions (100, 1000,
4000):

           autovac on   autovac off  stats off
-t 100     629/638/639  630/630/646  581/654/656  -- 8.3
           443/519/519  542/555/566  670/692/694  -- 8.2

-t 1000    622/626/653  630/635/653  631/631/652  -- 8.3
           523/528/541  522/535/550  679/681/690  -- 8.2

-t 4000    632/635/644  248/385/651* 191/401/641* -- 8.3
           465/472/520  237/336/538* 249/379/702* -- 8.2

8.2 is faster only if statistics are off

Note:
* I didn't use the -v parameter (do_vacuum_accounts)

I am sorry for the noise

Regards
Pavel Stehule

#6 Josh Berkus
josh@agliodbs.com
In reply to: Pavel Stehule (#5)
Re: 8.2 is 30% better in pgbench than 8.3

Pavel Stehule wrote:

Hello,

I checked my tests again and I have different results. Now I tested
PostgreSQL on a dedicated server. Now 8.3 is about 20% faster. I didn't
see a strong impact from autovacuum. All numbers are approximate
only. I ran pgbench 3x for the following configurations (autovacuum on,
autovacuum off, statistics off) and for -t transactions (100, 1000,
4000):

In other news, 8.3 with current HOT is 13% faster than 8.2 at TPCE in
the first 1/2 hour. Performance does not fall over 5 hours of test run,
and most of the main tables never have autovacuum triggered at all.
Unfortunately, we don't yet have a 5-hour 8.2 run to compare
last-half-hour performance.

--Josh

#7 Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#2)
Re: 8.2 is 30% better in pgbench than 8.3

Tom,

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

I've been using 20ms for most of my setups. That's aimed at reducing
autovac to almost no impact at all, at the cost of vacuums taking a
long time. Maybe 10ms?
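For a rough sense of what 20ms means: with the stock cost settings (vacuum_cost_limit = 200, vacuum_cost_page_miss = 10, 8 kB blocks), the delay caps vacuum's cold-read rate. A back-of-envelope sketch that ignores the I/O time itself:

```python
def max_vacuum_read_mb_per_sec(cost_delay_ms, cost_limit=200,
                               cost_page_miss=10, block_size_kb=8):
    """Rough ceiling on vacuum's read rate under cost-based delay.

    Vacuum sleeps for cost_delay_ms after accumulating cost_limit
    credits; a page read from disk costs cost_page_miss credits.
    Sleep time dominates on fast storage, so treat each sleep round
    as allowing cost_limit/cost_page_miss page misses.
    """
    rounds_per_sec = 1000.0 / cost_delay_ms
    pages_per_sec = rounds_per_sec * (cost_limit / cost_page_miss)
    return pages_per_sec * block_size_kb / 1024.0
```

By this estimate, a 20ms delay throttles vacuum to roughly 8 MB/s of page misses, and 10ms to about twice that.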

--Josh

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#4)
Re: 8.2 is 30% better in pgbench than 8.3

Greg Smith <gsmith@gregsmith.com> writes:

I'd want a set of 3 at each configuration because even with longer runs,
you occasionally get really odd results. Until you have 3 it can be
unclear which is the weird one.

Yeah, pgbench results are notoriously unrepeatable. One issue is that
the first run after pgbench -i sees conditions a lot different from
subsequent runs (no free space in tables, indexes are more tightly
packed than they will be later, etc). The recently added option to
initialize the tables with a selected fillfactor might help here,
but personally I've not experimented with it.

There also seems to be some of the "good average but bad worst case"
behavior that Josh and others have pointed out in bigger benchmarks.
I've always assumed this was due to checkpointing (and autovac if
enabled). If your test run isn't long enough to cover one full
checkpoint cycle then the results will be quite variable depending
on whether it included a checkpoint or not. 8.3 might alleviate
this effect to some extent.

regards, tom lane

#9 Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#8)
Re: 8.2 is 30% better in pgbench than 8.3

On Sun, 22 Jul 2007, Tom Lane wrote:

There also seems to be some of the "good average but bad worst case"
behavior that Josh and others have pointed out in bigger benchmarks.
I've always assumed this was due to checkpointing

A lot of variation is from checkpoints, some comes from the index
fullness/fragmentation, some from vacuuming, a bit is added based on
what's in the buffer cache from previous runs, and there's a touch of
randomness thrown on top even if you control all those things. My tests
suggest the "bad worst case" results in pgbench tests are almost always
from checkpoints that happen when the buffer cache is filled with almost
exclusively dirty buffers (which can happen very easily with pgbench).

I am in the process of squashing these issues when running pgbench against
8.3. A before/after look at pg_stat_bgwriter lets you measure the
checkpoint variation. The ability to directly play with the index
fillfactor lets you test/control the impact of that. Wrap all that into a
scripting framework that runs the tests many times for you in a consistent
fashion and then summarizes the results, and pgbench becomes a rough but
completely serviceable tool. Expect to see a pgbench-tools project that
does all that from me soon; my copy works but it's not ready for public
consumption yet.
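The pg_stat_bgwriter before/after comparison boils down to subtracting two snapshots of the view's counters. A sketch with the snapshots as plain dicts (column names as in 8.3's pg_stat_bgwriter; the values here are made up for illustration):

```python
def bgwriter_delta(before, after):
    """Diff two snapshots of pg_stat_bgwriter taken around a pgbench run.

    Each snapshot maps counter name -> value, e.g. as fetched with
    SELECT * FROM pg_stat_bgwriter before and after the benchmark.
    """
    return {k: after[k] - before[k] for k in before}

before = {"checkpoints_timed": 10, "checkpoints_req": 2,
          "buffers_checkpoint": 5000}
after = {"checkpoints_timed": 11, "checkpoints_req": 4,
         "buffers_checkpoint": 9200}
# The delta shows two requested checkpoints and 4200 buffers written
# by checkpoints during the run.
```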

I've also got a working backport of all the pg_stat_bgwriter instrumenting
that applies to 8.2, so people can run all this against the current
version as well if they're in a position where they can hack a custom
build. The hold-up on that is that the code that measures buffers
allocated and those written by back-ends for 8.3 is wrapped into the
"Automatic adjustment of bgwriter_lru_maxpages" patch, and I'm not going
to finalize my unofficial backport until that gets applied. Once HOT
wraps up that loose end should get snipped easily enough.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#10 Simon Riggs
simon@2ndquadrant.com
In reply to: Josh Berkus (#6)
Re: 8.2 is 30% better in pgbench than 8.3

On Sun, 2007-07-22 at 08:53 -0700, Josh Berkus wrote:

Pavel Stehule wrote:

Hello,

I checked my tests again and I have different results. Now I tested
PostgreSQL on a dedicated server. Now 8.3 is about 20% faster. I didn't
see a strong impact from autovacuum. All numbers are approximate
only. I ran pgbench 3x for the following configurations (autovacuum on,
autovacuum off, statistics off) and for -t transactions (100, 1000,
4000):

In other news, 8.3 with current HOT is 13% faster than 8.2 at TPCE in
the first 1/2 hour. Performance does not fall over 5 hours of test run,
and most of the main tables never have autovacuum triggered at all.
Unfortunately, we don't yet have a 5-hour 8.2 run to compare
last-half-hour performance.

I think the rule of thumb is that if the workload doesn't have enough
UPDATEs to trigger VACUUMs, then HOT will have little benefit.

With any workload, we should run it *until* we see some autovacuums
kick in, so we can compare the overall situation of HOT vs. non-HOT. HOT
is designed for longer-term benefit; fillfactor benefits fade over time
(as defined).

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#11 Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: 8.2 is 30% better in pgbench than 8.3

On Sat, 2007-07-21 at 13:30 -0400, Tom Lane wrote:

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

A couple of thoughts here:

HOT will reduce the need for VACUUMs quite considerably, so multiple
concurrent VACUUMs become a quite rare situation. We should re-evaluate
this once we have taken the go/no-go decision for HOT in 8.3.

The good thing about having multiple autovacuum daemons active is that
this reduces the possibility of having small tables starved while a
single large VACUUM runs to completion. My experience is that a single
large table can take many hours, possibly preventing 10,000 VACUUMs of
small tables within that time.

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more. I would suggest that
we look at ways of queueing, so that multiple large VACUUMs cannot
occur. Setting vacuum_cost_delay will still allow multiple large VACUUMs
but will make the starvation problem even worse as well. If we allow
that situation to occur, I think I'd rather stick to autovac_workers=1.
We will still have this potential problem even with HOT.

Potential solution: Each autovac worker gets a range of table sizes they
are allowed to VACUUM. This is set with an additional parameter which is
an array of gating values (i.e. one less gating value than number of
autovac workers). That way small VACUUMs are never starved out by large
ones. This is the same as having a Small:Medium:Large style queueing
system. We can work out how to make the queueing system self-tune by
observation of autovacuum frequency.
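The gating-value idea could look roughly like this (purely hypothetical; nothing like this exists in autovacuum):

```python
import bisect

def worker_for_table(table_pages, gates):
    """Assign a table to an autovac worker bucket by size.

    As proposed above, `gates` holds one fewer threshold than there
    are workers: worker 0 handles tables up to and including gates[0]
    pages, worker 1 up to gates[1], and the last worker takes the rest.
    A small table can therefore never be queued behind a multi-hour
    VACUUM of a big one.
    """
    return bisect.bisect_left(gates, table_pages)

# Three workers with gates at 1,000 and 100,000 pages:
# tiny tables go to worker 0, mid-size to worker 1, huge to worker 2.
```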

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#12 Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Simon Riggs (#11)
Re: 8.2 is 30% better in pgbench than 8.3

Simon Riggs wrote:

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more. I would suggest that
we look at ways of queueing, so that multiple large VACUUMs cannot
occur. Setting vacuum_cost_delay will still allow multiple large VACUUMs
but will make the starvation problem even worse as well. If we allow
that situation to occur, I think I'd rather stick to autovac_workers=1.
We will still have this potential problem even with HOT.

Potential solution: Each autovac worker gets a range of table sizes they
are allowed to VACUUM. This is set with an additional parameter which is
an array of gating values (i.e. one less gating value than number of
autovac workers). That way small VACUUMs are never starved out by large
ones. This is the same as having a Small:Medium:Large style queueing
system. We can work out how to make the queueing system self-tune by
observation of autovacuum frequency.

default autovac_workers is 3, so wouldn't you need three, not two, large
VACUUMs to starve a smaller table?

Instead of queuing, how about increasing autovac_workers if starvation
is a concern?

I'd like to set a default autovacuum_vacuum_cost_delay anyway. Without
it, autovacuum is a performance hit when it kicks in, even if there's
only one of them running, and even if it only lasts for a short time.
It's an unpleasant surprise for someone who's new to PostgreSQL and
doesn't yet understand how vacuum and autovacuum work.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#13 Simon Riggs
simon@2ndquadrant.com
In reply to: Heikki Linnakangas (#12)
Re: 8.2 is 30% better in pgbench than 8.3

On Mon, 2007-07-23 at 10:04 +0100, Heikki Linnakangas wrote:

Simon Riggs wrote:

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more. I would suggest that
we look at ways of queueing, so that multiple large VACUUMs cannot
occur. Setting vacuum_cost_delay will still allow multiple large VACUUMs
but will make the starvation problem even worse as well. If we allow
that situation to occur, I think I'd rather stick to autovac_workers=1.
We will still have this potential problem even with HOT.

Potential solution: Each autovac worker gets a range of table sizes they
are allowed to VACUUM. This is set with an additional parameter which is
an array of gating values (i.e. one less gating value than number of
autovac workers). That way small VACUUMs are never starved out by large
ones. This is the same as having a Small:Medium:Large style queueing
system. We can work out how to make the queueing system self-tune by
observation of autovacuum frequency.

default autovac_workers is 3, so wouldn't you need three, not two, large
VACUUMs to starve a smaller table?

Instead of queuing, how about increasing autovac_workers if starvation
is a concern?

Neither of those things prevents the problem; they just make it less
likely. I don't think that's a good answer for production systems that
have response-time service level agreements to meet.

I'd like to set a default autovacuum_vacuum_cost_delay anyway. Without
it, autovacuum is a performance hit when it kicks in, even if there's
only one of them running, and even if it only lasts for a short time.
It's an unpleasant surprise for someone who's new to PostgreSQL and
doesn't yet understand how vacuum and autovacuum work.

I agree, but only if we can prevent the starvation problem while we do
it, otherwise it just gets worse.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#14 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#11)
Re: 8.2 is 30% better in pgbench than 8.3

Simon Riggs wrote:

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more. I would suggest that
we look at ways of queueing, so that multiple large VACUUMs cannot
occur. Setting vacuum_cost_delay will still allow multiple large VACUUMs
but will make the starvation problem even worse as well. If we allow
that situation to occur, I think I'd rather stick to autovac_workers=1.
We will still have this potential problem even with HOT.

We already discussed all this to death before feature freeze. I'm not
sure if it's a good idea to try to come up with new heuristics for the
thing this late. Feel free to work on it for 8.4 though!

I also wonder whether you have noticed the "balancing" code in autovac.
Whenever more than one autovac worker is running, they split the
available I/O allocated to them fairly, so that each one delays more
frequently than if it was running alone. The net effect is supposed to
be that no matter how many workers are running, your vacuum delay
settings are respected.
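The balancing described above amounts to splitting the configured I/O budget across the active workers; a simplified sketch (the real logic in autovacuum.c is more involved and also accounts for per-table settings):

```python
def balanced_cost_limit(base_cost_limit, active_workers):
    """Split the autovacuum I/O budget across concurrent workers.

    With N workers running, each gets roughly 1/N of the configured
    credits per delay round, so the aggregate vacuum I/O rate stays
    near the single-worker rate no matter how many are active.
    """
    return max(1, base_cost_limit // active_workers)

# 200 credits shared by 3 workers leaves 66 each; the combined I/O
# rate is roughly what one worker alone would have been allowed.
```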

In any case, I think a better solution to the starvation problem caused
by huge tables is not skipping the vacuuming of them, but making it less
wasteful, for example with the DSM.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#15 Simon Riggs
simon@2ndquadrant.com
In reply to: Alvaro Herrera (#14)
Re: 8.2 is 30% better in pgbench than 8.3

On Mon, 2007-07-23 at 12:00 -0400, Alvaro Herrera wrote:

Simon Riggs wrote:

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more. I would suggest that
we look at ways of queueing, so that multiple large VACUUMs cannot
occur. Setting vacuum_cost_delay will still allow multiple large VACUUMs
but will make the starvation problem even worse as well. If we allow
that situation to occur, I think I'd rather stick to autovac_workers=1.
We will still have this potential problem even with HOT.

We already discussed all this to death before feature freeze.

...and starvation has still not been avoided. I like what you have done,
but we still have a problem, whichever release it gets fixed in.

I'm not
sure if it's a good idea to try to come up with new heuristics for the
thing this late. Feel free to work on it for 8.4 though!

I also wonder whether you have noticed the "balancing" code in autovac.
Whenever more than one autovac worker is running, they split the
available I/O allocated to them fairly, so that each one delays more
frequently than if it was running alone. The net effect is supposed to
be that no matter how many workers are running, your vacuum delay
settings are respected.

I did and I like it, many thanks.

In any case, I think a better solution to the starvation problem caused
by huge tables is not skipping the vacuuming of them, but making it less
wasteful, for example with the DSM.

Neither of those things prevents starvation though.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#16 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#15)
Re: 8.2 is 30% better in pgbench than 8.3

Simon Riggs wrote:

On Mon, 2007-07-23 at 12:00 -0400, Alvaro Herrera wrote:

Simon Riggs wrote:

The bad thing about having multiple autovacuum daemons active is that
you can get two large VACUUMs running at the same time. This gives you
the same small-VACUUM-starvation problem we had before, but now the
effects of two VACUUMs kill performance even more.

We already discussed all this to death before feature freeze.

...and starvation has still not been avoided. I like what you have done,
but we still have a problem, whichever release it gets fixed in.

Oh, I will be the first to admit that autovacuum is still not "good enough".

In any case, I think a better solution to the starvation problem caused
by huge tables is not skipping the vacuuming of them, but making it less
wasteful, for example with the DSM.

Neither of those things prevents starvation though.

Certainly it doesn't prevent starvation completely -- really there is no
way to completely prevent starvation unless you have as many workers as
you have tables, and one disk for each. What DSM does do is let the big
tables be vacuumed quickly, which makes most of the problem go away.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#17 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Alvaro Herrera (#16)
Re: 8.2 is 30% better in pgbench than 8.3

On 7/23/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Certainly it doesn't prevent starvation completely -- really there is no
way to completely prevent starvation unless you have as many workers as
you have tables, and one disk for each. What DSM does do is let the big
tables be vacuumed quickly, which makes most of the problem go away.

Frankly, I haven't looked at the DSM results very closely, but DSM can
help us avoid full heap scans (and that's a big thing!). It can't avoid
the associated index scans, though, and that might limit our ability to
vacuum very large tables frequently.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#18 Jim Nasby
decibel@decibel.org
In reply to: Josh Berkus (#7)
Re: 8.2 is 30% better in pgbench than 8.3

On Jul 22, 2007, at 8:54 AM, Josh Berkus wrote:

Tom,

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost
delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

I've been using 20ms for most of my setups. That's aimed at
reducing autovac to almost no impact at all, at the cost of vacuums
taking a long time. Maybe 10ms?

I've found 20ms to be a pretty good number for run-of-the-mill IO
capability, and 10ms to be good for a good RAID setup (RAID10, 8+
drives, BBU).

For a default setting, I think it'd be better to lean towards 20ms.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#19 Jim Nasby
decibel@decibel.org
In reply to: Simon Riggs (#11)
Re: 8.2 is 30% better in pgbench than 8.3

On Jul 23, 2007, at 1:40 AM, Simon Riggs wrote:

Potential solution: Each autovac worker gets a range of table sizes
they
are allowed to VACUUM.

Anyone putting thought into this should check the archives; there was
quite a bit of discussion around it. For 8.3 we decided to KISS so
that it would get into the release, but we really do need to tackle the
starvation issue for 8.4.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#20 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Pavan Deolasee (#17)
Re: 8.2 is 30% better in pgbench than 8.3

Pavan Deolasee wrote:

On 7/23/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Certainly it doesn't prevent starvation completely -- really there is no
way to completely prevent starvation unless you have as many workers as
you have tables, and one disk for each. What DSM does do is let the big
tables be vacuumed quickly, which makes most of the problem go away.

Frankly, I haven't looked at the DSM results very closely, but DSM can
help us avoid full heap scans (and that's a big thing!). It can't avoid
the associated index scans, though, and that might limit our ability to
vacuum very large tables frequently.

I haven't seen DSM either, so IMBFoS. You are right about index scans
though. Fortunately they are not as expensive as they used to be, thanks
to Heikki's changes to allow physical-order scanning.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#21 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jim Nasby (#18)
1 attachment(s)
Re: 8.2 is 30% better in pgbench than 8.3

Jim Nasby wrote:

On Jul 22, 2007, at 8:54 AM, Josh Berkus wrote:

Tom,

Note to all: we ***HAVE TO*** settle on some reasonable default
vacuum_cost_delay settings before we can ship 8.3. With no cost delay
and two or three workers active, 8.3's autovac does indeed send
performance into the tank.

I've been using 20ms for most of my setups. That's aimed at reducing
autovac to almost no impact at all, at the cost of vacuums taking a
long time. Maybe 10ms?

I've found 20ms to be a pretty good number for run-of-the-mill IO
capability, and 10ms to be good for a good RAID setup (RAID10, 8+ drives,
BBU).

For a default setting, I think it'd be better to lean towards 20ms.

OK, 20ms it is then. Here is a patch. I am taking the liberty to also
lower the vacuum and analyze threshold default values to 50, per
previous discussion.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

autovac-params.patch (text/x-diff; charset=us-ascii)
? msg
? src/tools/entab/entab
? src/tools/entab/entab.fix.diff
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.405
diff -c -p -r1.405 guc.c
*** src/backend/utils/misc/guc.c	10 Jul 2007 13:14:21 -0000	1.405
--- src/backend/utils/misc/guc.c	23 Jul 2007 17:50:01 -0000
*************** static struct config_int ConfigureNamesI
*** 1348,1354 ****
  			GUC_UNIT_MS
  		},
  		&autovacuum_vac_cost_delay,
! 		-1, -1, 1000, NULL, NULL
  	},
  
  	{
--- 1348,1354 ----
  			GUC_UNIT_MS
  		},
  		&autovacuum_vac_cost_delay,
! 		20, -1, 1000, NULL, NULL
  	},
  
  	{
*************** static struct config_int ConfigureNamesI
*** 1656,1662 ****
  			NULL
  		},
  		&autovacuum_vac_thresh,
! 		500, 0, INT_MAX, NULL, NULL
  	},
  	{
  		{"autovacuum_analyze_threshold", PGC_SIGHUP, AUTOVACUUM,
--- 1656,1662 ----
  			NULL
  		},
  		&autovacuum_vac_thresh,
! 		50, 0, INT_MAX, NULL, NULL
  	},
  	{
  		{"autovacuum_analyze_threshold", PGC_SIGHUP, AUTOVACUUM,
*************** static struct config_int ConfigureNamesI
*** 1664,1670 ****
  			NULL
  		},
  		&autovacuum_anl_thresh,
! 		250, 0, INT_MAX, NULL, NULL
  	},
  	{
  		/* see varsup.c for why this is PGC_POSTMASTER not PGC_SIGHUP */
--- 1664,1670 ----
  			NULL
  		},
  		&autovacuum_anl_thresh,
! 		50, 0, INT_MAX, NULL, NULL
  	},
  	{
  		/* see varsup.c for why this is PGC_POSTMASTER not PGC_SIGHUP */
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.219
diff -c -p -r1.219 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	10 Jul 2007 13:14:21 -0000	1.219
--- src/backend/utils/misc/postgresql.conf.sample	23 Jul 2007 17:50:01 -0000
***************
*** 381,389 ****
  #log_autovacuum = -1			# -1 is disabled, 0 logs all actions
  					# and their durations, > 0 logs only
  					# actions running at least N msec.
! #autovacuum_vacuum_threshold = 500	# min # of tuple updates before
  					# vacuum
! #autovacuum_analyze_threshold = 250	# min # of tuple updates before 
  					# analyze
  #autovacuum_vacuum_scale_factor = 0.2	# fraction of rel size before 
  					# vacuum
--- 381,389 ----
  #log_autovacuum = -1			# -1 is disabled, 0 logs all actions
  					# and their durations, > 0 logs only
  					# actions running at least N msec.
! #autovacuum_vacuum_threshold = 50	# min # of tuple updates before
  					# vacuum
! #autovacuum_analyze_threshold = 50	# min # of tuple updates before 
  					# analyze
  #autovacuum_vacuum_scale_factor = 0.2	# fraction of rel size before 
  					# vacuum
***************
*** 391,397 ****
  					# analyze
  #autovacuum_freeze_max_age = 200000000	# maximum XID age before forced vacuum
  					# (change requires restart)
! #autovacuum_vacuum_cost_delay = -1	# default vacuum cost delay for 
  					# autovacuum, -1 means use 
  					# vacuum_cost_delay
  #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for 
--- 391,397 ----
  					# analyze
  #autovacuum_freeze_max_age = 200000000	# maximum XID age before forced vacuum
  					# (change requires restart)
! #autovacuum_vacuum_cost_delay = 20	# default vacuum cost delay for 
  					# autovacuum, -1 means use 
  					# vacuum_cost_delay
  #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for 
#22 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#21)
Re: 8.2 is 30% better in pgbench than 8.3

Alvaro Herrera <alvherre@commandprompt.com> writes:

OK, 20ms it is then. Here is a patch. I am taking the liberty to also
lower the vacuum and analyze threshold default values to 50, per
previous discussion.

Patch probably needs to touch docs (config.sgml at least) too?

regards, tom lane

#23 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#22)
1 attachment(s)
Re: 8.2 is 30% better in pgbench than 8.3

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

OK, 20ms it is then. Here is a patch. I am taking the liberty to also
lower the vacuum and analyze threshold default values to 50, per
previous discussion.

Patch probably needs to touch docs (config.sgml at least) too?

That's right -- I was happy because I checked maintenance.sgml and
catalog.sgml and they didn't need any update, but I forgot to check
config.sgml.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

autovac-params-2.patch (text/x-diff; charset=us-ascii)
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.131
diff -c -p -r1.131 config.sgml
*** doc/src/sgml/config.sgml	18 Jul 2007 12:00:47 -0000	1.131
--- doc/src/sgml/config.sgml	24 Jul 2007 00:22:46 -0000
*************** SELECT * FROM parent WHERE key = 2400;
*** 3247,3253 ****
         <para>
          Specifies the minimum number of updated or deleted tuples needed
          to trigger a <command>VACUUM</> in any one table.
!         The default is 500 tuples.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
--- 3247,3253 ----
         <para>
          Specifies the minimum number of updated or deleted tuples needed
          to trigger a <command>VACUUM</> in any one table.
!         The default is 50 tuples.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
*************** SELECT * FROM parent WHERE key = 2400;
*** 3265,3271 ****
         <para>
          Specifies the minimum number of inserted, updated or deleted tuples
          needed to trigger an <command>ANALYZE</> in any one table.
!         The default is 250 tuples.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
--- 3265,3271 ----
         <para>
          Specifies the minimum number of inserted, updated or deleted tuples
          needed to trigger an <command>ANALYZE</> in any one table.
!         The default is 50 tuples.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
*************** SELECT * FROM parent WHERE key = 2400;
*** 3343,3350 ****
         <para>
          Specifies the cost delay value that will be used in automatic
          <command>VACUUM</> operations.  If <literal>-1</> is
!         specified (which is the default), the regular
          <xref linkend="guc-vacuum-cost-delay"> value will be used.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
--- 3343,3351 ----
         <para>
          Specifies the cost delay value that will be used in automatic
          <command>VACUUM</> operations.  If <literal>-1</> is
!         specified, the regular
          <xref linkend="guc-vacuum-cost-delay"> value will be used.
+         The default value is 20 milliseconds.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
          This setting can be overridden for individual tables by entries in
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.405
diff -c -p -r1.405 guc.c
*** src/backend/utils/misc/guc.c	10 Jul 2007 13:14:21 -0000	1.405
--- src/backend/utils/misc/guc.c	24 Jul 2007 00:22:51 -0000
*************** static struct config_int ConfigureNamesI
*** 1348,1354 ****
  			GUC_UNIT_MS
  		},
  		&autovacuum_vac_cost_delay,
! 		-1, -1, 1000, NULL, NULL
  	},
  
  	{
--- 1348,1354 ----
  			GUC_UNIT_MS
  		},
  		&autovacuum_vac_cost_delay,
! 		20, -1, 1000, NULL, NULL
  	},
  
  	{
*************** static struct config_int ConfigureNamesI
*** 1656,1662 ****
  			NULL
  		},
  		&autovacuum_vac_thresh,
! 		500, 0, INT_MAX, NULL, NULL
  	},
  	{
  		{"autovacuum_analyze_threshold", PGC_SIGHUP, AUTOVACUUM,
--- 1656,1662 ----
  			NULL
  		},
  		&autovacuum_vac_thresh,
! 		50, 0, INT_MAX, NULL, NULL
  	},
  	{
  		{"autovacuum_analyze_threshold", PGC_SIGHUP, AUTOVACUUM,
*************** static struct config_int ConfigureNamesI
*** 1664,1670 ****
  			NULL
  		},
  		&autovacuum_anl_thresh,
! 		250, 0, INT_MAX, NULL, NULL
  	},
  	{
  		/* see varsup.c for why this is PGC_POSTMASTER not PGC_SIGHUP */
--- 1664,1670 ----
  			NULL
  		},
  		&autovacuum_anl_thresh,
! 		50, 0, INT_MAX, NULL, NULL
  	},
  	{
  		/* see varsup.c for why this is PGC_POSTMASTER not PGC_SIGHUP */
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.219
diff -c -p -r1.219 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	10 Jul 2007 13:14:21 -0000	1.219
--- src/backend/utils/misc/postgresql.conf.sample	24 Jul 2007 00:22:52 -0000
***************
*** 381,389 ****
  #log_autovacuum = -1			# -1 is disabled, 0 logs all actions
  					# and their durations, > 0 logs only
  					# actions running at least N msec.
! #autovacuum_vacuum_threshold = 500	# min # of tuple updates before
  					# vacuum
! #autovacuum_analyze_threshold = 250	# min # of tuple updates before 
  					# analyze
  #autovacuum_vacuum_scale_factor = 0.2	# fraction of rel size before 
  					# vacuum
--- 381,389 ----
  #log_autovacuum = -1			# -1 is disabled, 0 logs all actions
  					# and their durations, > 0 logs only
  					# actions running at least N msec.
! #autovacuum_vacuum_threshold = 50	# min # of tuple updates before
  					# vacuum
! #autovacuum_analyze_threshold = 50	# min # of tuple updates before 
  					# analyze
  #autovacuum_vacuum_scale_factor = 0.2	# fraction of rel size before 
  					# vacuum
***************
*** 391,397 ****
  					# analyze
  #autovacuum_freeze_max_age = 200000000	# maximum XID age before forced vacuum
  					# (change requires restart)
! #autovacuum_vacuum_cost_delay = -1	# default vacuum cost delay for 
  					# autovacuum, -1 means use 
  					# vacuum_cost_delay
  #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for 
--- 391,397 ----
  					# analyze
  #autovacuum_freeze_max_age = 200000000	# maximum XID age before forced vacuum
  					# (change requires restart)
! #autovacuum_vacuum_cost_delay = 20	# default vacuum cost delay for 
  					# autovacuum, -1 means use 
  					# vacuum_cost_delay
  #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for 
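For reference, the per-table trigger condition these defaults feed into (as documented for autovacuum) can be sketched as follows; the function names are illustrative, not PostgreSQL source:

```python
# Sketch of the autovacuum trigger arithmetic with the patch's new defaults.
# Vacuum fires when dead tuples exceed threshold + scale_factor * reltuples;
# analyze uses the count of inserted/updated/deleted tuples instead.

def vacuum_needed(dead_tuples, reltuples,
                  vacuum_threshold=50,          # was 500
                  vacuum_scale_factor=0.2):
    return dead_tuples > vacuum_threshold + vacuum_scale_factor * reltuples

def analyze_needed(changed_tuples, reltuples,
                   analyze_threshold=50,        # was 250
                   analyze_scale_factor=0.1):
    return changed_tuples > analyze_threshold + analyze_scale_factor * reltuples
```

On a 1,000-row table the old defaults needed more than 700 dead tuples before vacuuming (500 + 0.2 * 1000); the new ones need more than 250.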
#24Gregory Stark
stark@enterprisedb.com
In reply to: Alvaro Herrera (#23)
Re: 8.2 is 30% better in pgbench than 8.3

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I am taking the liberty to also lower the vacuum and analyze threshold
default values to 50, per previous discussion.

Did we also reach any consensus about lowering the percentage of dead tuples
in a table before we trigger vacuum? I think 20% is way too high and 5% is
saner. I actually think it would be better even lower but would be ok with 5%.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#25Joshua D. Drake
jd@commandprompt.com
In reply to: Gregory Stark (#24)
Re: 8.2 is 30% better in pgbench than 8.3

Gregory Stark wrote:

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I am taking the liberty to also lower the vacuum and analyze threshold
default values to 50, per previous discussion.

Did we also reach any consensus about lowering the percentage of dead tuples
in a table before we trigger vacuum? I think 20% is way too high and 5% is
saner. I actually think it would be better even lower but would be ok with 5%.

I think that 5-10% is reasonable; 20% is way too high.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#26Alvaro Herrera
alvherre@commandprompt.com
In reply to: Gregory Stark (#24)
autovacuum default parameters (was Re: 8.2 is 30% better in pgbench than 8.3)

Gregory Stark wrote:

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I am taking the liberty to also lower the vacuum and analyze threshold
default values to 50, per previous discussion.

Did we also reach any consensus about lowering the percentage of dead tuples
in a table before we trigger vacuum? I think 20% is way too high and 5% is
saner. I actually think it would be better even lower but would be ok with 5%.

We didn't, but while I agree with the idea, I think 5% is too low. I
don't want autovacuum to get excessively aggressive. Is 10% not enough?

How about the analyze scale factor, should we keep the current 10%? I
have less of a problem with reducing it further since analyze is cheaper
than vacuum.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#27Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#26)
Re: autovacuum default parameters (was Re: 8.2 is 30% better in pgbench than 8.3)

Alvaro Herrera wrote:

Gregory Stark wrote:

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I am taking the liberty to also lower the vacuum and analyze threshold
default values to 50, per previous discussion.

Did we also reach any consensus about lowering the percentage of dead tuples
in a table before we trigger vacuum? I think 20% is way too high and 5% is
saner. I actually think it would be better even lower but would be ok with 5%.

We didn't, but while I agree with the idea, I think 5% is too low. I
don't want autovacuum to get excessively aggressive. Is 10% not enough?

It depends, really. 10% on a small table seems like a waste, except that
small tables are quick to vacuum. 10% on a table with 20 million rows
is a lot of dead rows.

Joshua D. Drake

How about the analyze scale factor, should we keep the current 10%? I
have less of a problem with reducing it further since analyze is cheaper
than vacuum.

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#28ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Alvaro Herrera (#26)
Re: autovacuum default parameters (was Re: 8.2 is 30% better in pgbench than 8.3)

Alvaro Herrera <alvherre@commandprompt.com> wrote:

We didn't, but while I agree with the idea, I think 5% is too low. I
don't want autovacuum to get excessively aggressive. Is 10% not enough?

I think the threshold should be a little less than the PCTFREE of indexes,
to avoid splitting of btree leaves. It might be good to decrease the
fillfactor to 85% or so if we choose 10% for it.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#29Gregory Stark
stark@enterprisedb.com
In reply to: Alvaro Herrera (#26)
Re: autovacuum default parameters

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

We didn't, but while I agree with the idea, I think 5% is too low. I
don't want autovacuum to get excessively aggressive. Is 10% not enough?

Well let me flip it around. Would you think a default fillfactor of 10% would
be helpful or overkill? I think it would nearly always be overkill and waste
heap space and therefore cache hit rate and i/o bandwidth.

I get my 5% intuition from the TPCC stock table which has about 20 tuples per
page. That means a fillfactor or vacuum at 5% both translate into trying to
maintain a margin of one tuple's worth of space per page. Enough for an update
to happen without migrating to a new page.

That's actually a fairly wide table though. A narrower table could easily have
50-100 tuples per page which would require only 1-2% of dead space overhead.

<idle speculation>

Perhaps the two parameters should be tied together and we should make the
autovacuum parameter: max(1%, min(10%, fillfactor(table))) and make the
default fill factor 5%.

Hm. We have the width of the table in the stats don't we? We could actually
calculate the "1 tuple's worth of space" percentage automatically on a
per-table basis. Or for that matter instead of calculating it as a percentage
of the whole table, just compare the number of updates/deletes with the number
of pages in the table.

</speculation>
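That per-page arithmetic is easy to sketch (illustrative only; BLCKSZ is the default 8 kB page size and the function names are made up):

```python
BLCKSZ = 8192  # default PostgreSQL page size in bytes

def one_tuple_pct(avg_tuple_width):
    # Fraction of a page occupied by a single tuple,
    # i.e. roughly 1 / (tuples per page).
    return avg_tuple_width / BLCKSZ

def per_table_scale_factor(avg_tuple_width):
    # The max(1%, min(10%, one tuple's worth)) clamp from the idea above.
    return max(0.01, min(0.10, one_tuple_pct(avg_tuple_width)))
```

A ~400-byte tuple (about 20 per page, as with the TPCC stock table) works out to roughly 5%, while a 50-byte tuple hits the 1% floor.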

How about the analyze scale factor, should we keep the current 10%? I
have less of a problem with reducing it further since analyze is cheaper
than vacuum.

My "try to maintain one tuple's worth of space" model doesn't answer this
question at all. It depends entirely on whether the DML is changing the data
distribution.

Perhaps this should be 1/max(stats_target) for the table. So the default would
be 10% but if you raise the stats_target for a column to 100 it would go down
to 1% or so.

The idea being that if you have ten buckets then updating 1/10th of the rows
stands an even chance of doubling or halving the size of your bucket. Except
there's no math behind that intuition at all and I rather doubt it makes much
sense.

Actually I feel like there should be a factor of 2 or more in there as well.
If you modify 1/10th of the rows and you have 10 buckets then we should be
analyzing *before* the distribution has a chance to be modified beyond
recognition.
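Putting the 1/stats_target idea together with that factor of 2, purely as illustration:

```python
def analyze_scale_factor(max_stats_target, safety_factor=2.0):
    # 1 / (safety_factor * stats_target): try to analyze before a
    # histogram bucket's contents can double or halve in size.
    return 1.0 / (safety_factor * max_stats_target)
```

With the old default stats_target of 10 this gives 5%; raising a column's target to 100 drops it to 0.5%.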

Perhaps I shouldn't have closed the <speculation> tag so early :) The problem
if we try to calculate reasonable defaults like this is it makes it unclear
how to expose any knob for the user to adjust it if they need to.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#30Jim Nasby
decibel@decibel.org
In reply to: Gregory Stark (#29)
Re: autovacuum default parameters

On Jul 24, 2007, at 1:02 AM, Gregory Stark wrote:

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

We didn't, but while I agree with the idea, I think 5% is too low. I
don't want autovacuum to get excessively aggressive. Is 10% not enough?

Well let me flip it around. Would you think a default fillfactor of 10% would
be helpful or overkill? I think it would nearly always be overkill and waste
heap space and therefore cache hit rate and i/o bandwidth.

I get my 5% intuition from the TPCC stock table which has about 20 tuples per
page. That means a fillfactor or vacuum at 5% both translate into trying to
maintain a margin of one tuple's worth of space per page. Enough for an update
to happen without migrating to a new page.

That's actually a fairly wide table though. A narrower table could easily have
50-100 tuples per page which would require only 1-2% of dead space overhead.

<idle speculation>

Perhaps the two parameters should be tied together and we should make the
autovacuum parameter: max(1%, min(10%, fillfactor(table))) and make the
default fill factor 5%.

Hm. We have the width of the table in the stats don't we? We could actually
calculate the "1 tuple's worth of space" percentage automatically on a
per-table basis. Or for that matter instead of calculating it as a percentage
of the whole table, just compare the number of updates/deletes with the number
of pages in the table.

</speculation>

How about the analyze scale factor, should we keep the current 10%? I
have less of a problem with reducing it further since analyze is cheaper
than vacuum.

My "try to maintain one tuple's worth of space" model doesn't answer this
question at all. It depends entirely on whether the DML is changing the data
distribution.

Perhaps this should be 1/max(stats_target) for the table. So the default would
be 10% but if you raise the stats_target for a column to 100 it would go down
to 1% or so.

The idea being that if you have ten buckets then updating 1/10th of the rows
stands an even chance of doubling or halving the size of your bucket. Except
there's no math behind that intuition at all and I rather doubt it makes much
sense.

Actually I feel like there should be a factor of 2 or more in there as well.
If you modify 1/10th of the rows and you have 10 buckets then we should be
analyzing *before* the distribution has a chance to be modified beyond
recognition.

Perhaps I shouldn't have closed the <speculation> tag so early :) The problem
if we try to calculate reasonable defaults like this is it makes it unclear
how to expose any knob for the user to adjust it if they need to.

In reality, I think trying to get much below 10% on any large-ish
production systems just isn't going to work well. It's starting to
approach the point where you need to be vacuuming continuously, which
is going to put us right back into starvation territory.

Put another way, there's only so low you can get table bloat with
vacuum as it currently stands. If you want to do better, you need
things like HOT and DSM.

Regarding page splits, it might make sense to drop the fillfactor a
bit. I'm thinking that in most cases, the difference between 85% and
90% won't be noticed. For cases where it will matter (i.e., insert-only),
you'd want to set fillfactor to 100% anyway.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)