More detail on settings for pgavd?

Started by Josh Berkus · over 22 years ago · 35 messages · hackers
#1Josh Berkus
josh@agliodbs.com

Shridhar,

I was looking at the -V/-v and -A/-a settings in pgavd, and really don't
understand how the calculation works. According to the readme, if I set -v
to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would
only vacuum after 21,000 rows had been updated. This seems wrong.

Can you clear this up a little? I'd like to tweak these settings but can't
without being better acquainted with the calculation.

Also, you may want to reverse your default ratio for Vacuum/analyze frequency.
True, analyze is a less expensive operation than Vacuum, but it's also needed
less often -- only when the *distribution* of data changes. I've seen
databases where the optimal vacuum/analyze frequency was every 10 min/once
per day.

--
-Josh Berkus
Aglio Database Solutions
San Francisco

#2Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Josh Berkus (#1)
Re: More detail on settings for pgavd?

Josh Berkus wrote:

Shridhar,

I was looking at the -V/-v and -A/-a settings in pgavd, and really don't
understand how the calculation works. According to the readme, if I set -v
to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would
only vacuum after 21,000 rows had been updated. This seems wrong.

Can you clear this up a little? I'd like to tweak these settings but can't
without being better acquainted with the calculation.

Also, you may want to reverse your default ratio for Vacuum/analyze frequency.
True, analyze is a less expensive operation than Vacuum, but it's also needed
less often -- only when the *distribution* of data changes. I've seen
databases where the optimal vacuum/analyze frequency was every 10 min/once
per day.

Will look into it. Give me a day or so. I am planning a couple of other patches
as well. Maybe over the weekend.

Is this urgent?

Shridhar

#3Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Josh Berkus (#1)
Re: More detail on settings for pgavd?

Josh Berkus wrote:

Shridhar,

I was looking at the -V/-v and -A/-a settings in pgavd, and really don't
understand how the calculation works. According to the readme, if I set -v
to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would
only vacuum after 21,000 rows had been updated. This seems wrong.

No, that is correct.

It is calculated as

threshold = base + scale * number of current rows

which translates to

21,000 = 1,000 + 2 * 10,000
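As a quick sketch of that calculation (the function name is mine and Python is just for illustration; pg_autovacuum itself is written in C):

```python
def vacuum_threshold(base, scale, ntuples):
    """Rows that must change before pgavd triggers a VACUUM.

    base    -- the -v setting (a fixed row count)
    scale   -- the -V setting (a multiplier on current table size)
    ntuples -- current number of rows in the table
    """
    return base + scale * ntuples

# The defaults (-v 1000, -V 2) on a 10,000-row table:
print(vacuum_threshold(1000, 2, 10_000))  # 21000
```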

However, I do not agree with this logic entirely. It pegs the next vacuum
w.r.t. the current table size, which is not always a good thing.

I would rather vacuum the table at 2000 updates, which is what you probably want.

Furthermore, the analyze threshold depends upon inserts+updates. I think it
should also depend upon deletes, for obvious reasons.

Can you clear this up a little? I'd like to tweak these settings but can't
without being better acquainted with the calculation.

What did you expect in the above example? It is not difficult to tweak the
pg_autovacuum calculations. For testing we can play around.

Also, you may want to reverse your default ratio for Vacuum/analyze frequency.
True, analyze is a less expensive operation than Vacuum, but it's also needed
less often -- only when the *distribution* of data changes. I've seen
databases where the optimal vacuum/analyze frequency was every 10 min/once
per day.

OK. Vacuum and analyze thresholds are calculated with the same formula as shown
above, but with different parameters, as follows.

vacthresh = vacbase + vacscale*ntuples
anathresh = anabase + anascale*ntuples

What you are asking for is

vacthresh = vacbase*vacscale
anathresh = anabase + anascale*ntuples

Would that tilt the balance the way you want? I.e. an analyze is triggered when a
fixed *percentage* of the table changes, but a vacuum is triggered when a fixed
*number of rows* is changed.
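A sketch of the two alternatives side by side (the function and parameter names are mine, not pg_autovacuum's):

```python
def thresholds_current(vacbase, vacscale, anabase, anascale, ntuples):
    # Today: both thresholds grow with the current row count.
    vacthresh = vacbase + vacscale * ntuples
    anathresh = anabase + anascale * ntuples
    return vacthresh, anathresh

def thresholds_proposed(vacbase, vacscale, anabase, anascale, ntuples):
    # Proposed: vacuum fires after a fixed number of changed rows,
    # while analyze still fires after a fixed percentage of the table.
    vacthresh = vacbase * vacscale
    anathresh = anabase + anascale * ntuples
    return vacthresh, anathresh
```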

I am all for experimentation. If you have real life data to play with, I can
give you some patches to play around.

And BTW, this is all the brainchild of Matthew O'Connor (correct? I am not good
at either names or spellings). The way I wrote pgavd originally, each table got
a separate threshold..:-). That was rather a brute-force approach.

Shridhar

#4Josh Berkus
josh@agliodbs.com
In reply to: Shridhar Daithankar (#2)
Re: More detail on settings for pgavd?

Shridhar,

Will look into it. Give me a day or so. I am planning a couple of other
patches as well. Maybe over the weekend.

Thanks, appreciated. As I said, I don't think the settings themselves are
wrong, I think the documentation is.

What are you patching?

--
Josh Berkus
Aglio Database Solutions
San Francisco

#5Josh Berkus
josh@agliodbs.com
In reply to: Shridhar Daithankar (#3)
Re: More detail on settings for pgavd?

Shridhar,

However, I do not agree with this logic entirely. It pegs the next vacuum
w.r.t. the current table size, which is not always a good thing.

No, I think the logic's fine, it's the numbers which are wrong. We want to
vacuum when updates reach between 5% and 15% of total rows. NOT when
updates reach 110% of total rows ... that's much too late.

Hmmm ... I also think the threshold level needs to be lowered; I guess the
purpose was to prevent continuous re-vacuuming of small tables?
Unfortunately, in the current implementation, the result is that small tables
never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default
calculation for a table with 10,000 rows is:

100 + ( 0.1 * 10,000 ) = 1100 rows.
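Side by side, a sketch of what those numbers would do versus the current defaults (illustrative code only, using the threshold formula from earlier in the thread):

```python
def vacuum_threshold(base, scale, ntuples):
    return base + scale * ntuples

# Current defaults (-v 1000, -V 2) vs. the proposed -v 100, -V 0.1:
for ntuples in (100, 10_000, 1_000_000):
    current = vacuum_threshold(1000, 2, ntuples)
    proposed = vacuum_threshold(100, 0.1, ntuples)
    print(f"{ntuples:>9,} rows: current {current:>9,}, proposed {proposed:>9,.0f}")
```

The proposed numbers keep the trigger proportionate to table size instead of waiting for more than a full table's worth of updates.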

I would rather vacuum the table at 2000 updates, which is what you probably
want.

Not necessarily. This would be painful if the table has 10,000,000 rows. It
*should* be based on a % of rows.

Furthermore, the analyze threshold depends upon inserts+updates. I think it
should also depend upon deletes, for obvious reasons.

Yes. Vacuum threshold is counting deletes, I hope?

What did you expect in the above example? It is not difficult to tweak the
pg_autovacuum calculations. For testing we can play around.

Can I set the settings to decimals, or are they integers?

vacthresh = vacbase*vacscale
anathresh = anabase + anascale*ntuples

Nope, see above.

My comment about the frequency of vacuums vs. analyze is that currently the
*default* is to analyze twice as often as you vacuum. Based on my
experience as a PG admin on a variety of databases, I believe that the default
should be to analyze half as often as you vacuum.

I am all for experimentation. If you have real life data to play with, I
can give you some patches to play around.

I will have real data very soon .....

--
Josh Berkus
Aglio Database Solutions
San Francisco

#6Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Josh Berkus (#5)
Re: More detail on settings for pgavd?

Josh Berkus wrote:

Shridhar,

However, I do not agree with this logic entirely. It pegs the next vacuum
w.r.t. the current table size, which is not always a good thing.

No, I think the logic's fine, it's the numbers which are wrong. We want to
vacuum when updates reach between 5% and 15% of total rows. NOT when
updates reach 110% of total rows ... that's much too late.

Well, it looks like thresholds below 1 should be the norm rather than the exception.

Hmmm ... I also think the threshold level needs to be lowered; I guess the
purpose was to prevent continuous re-vacuuming of small tables?
Unfortunately, in the current implementation, the result is that small tables
never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default
calculation for a table with 10,000 rows is:

100 + ( 0.1 * 10,000 ) = 1100 rows.

I would say -V 0.2-0.4 could be great as well. The fact to emphasize is that
thresholds of less than 1 should be used.

Furthermore, the analyze threshold depends upon inserts+updates. I think it
should also depend upon deletes, for obvious reasons.

Yes. Vacuum threshold is counting deletes, I hope?

It does.

My comment about the frequency of vacuums vs. analyze is that currently the
*default* is to analyze twice as often as you vacuum. Based on my
experience as a PG admin on a variety of databases, I believe that the default
should be to analyze half as often as you vacuum.

OK.

I am all for experimentation. If you have real life data to play with, I
can give you some patches to play around.

I will have real data very soon .....

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

n = updates + inserts *-* deletes

rather than the current "n = updates + inserts". I will also update the readme
with examples and notes on analyze frequency.
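In other words, a sketch of the change (counter names are mine; pg_autovacuum gets these numbers from the stats collector):

```python
def analyze_count_current(inserts, updates, deletes):
    # Current pg_autovacuum behaviour: deletes do not delay the next analyze.
    return updates + inserts

def analyze_count_proposed(inserts, updates, deletes):
    # Proposed: subtract deletes, so churn-heavy tables analyze later.
    return updates + inserts - deletes
```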

What do the statistics gather, BTW? Just the number of rows, or something else
as well? I think I will put that on Hackers separately.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

Shridhar

#7Matthew T. O'Connor
matthew@zeut.net
In reply to: Shridhar Daithankar (#6)
Re: [PERFORM] More detail on settings for pgavd?

Shridhar Daithankar wrote:

Josh Berkus wrote:

Shridhar,

However, I do not agree with this logic entirely. It pegs the next vacuum
w.r.t. the current table size, which is not always a good thing.

Ok, what do you recommend? The point of two separate variables is to allow
you to specify whether you want vacuum based on a fixed number, on table
size, or on something in between.

No, I think the logic's fine, it's the numbers which are wrong. We
want to vacuum when updates reach between 5% and 15% of total rows.
NOT when updates reach 110% of total rows ... that's much too late.

For small tables, you don't need to vacuum too often. In the testing I
did, a small table (~100 rows) didn't really show significant performance
degradation until it had close to 1000 updates. For large tables,
vacuum is so expensive that you don't want to do it very often, and
scanning the whole table when there is only 5% wasted space is not very
helpful.

Hmmm ... I also think the threshold level needs to be lowered; I
guess the purpose was to prevent continuous re-vacuuming of small
tables? Unfortunately, in the current implementation, the result is
that small tables never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default
calculation for a table with 10,000 rows is:

100 + ( 0.1 * 10,000 ) = 1100 rows.

Yes, I set the defaults a little high, perhaps, so as to err on the
side of caution. I didn't want people to say pg_autovacuum kills the
performance of my server. A small table will get vacuumed, just not
until it has reached the threshold. So a table with 100 rows will get
vacuumed after 1200 updates / deletes. My testing showed that
there were no major performance problems until you reached several
thousand updates / deletes.

Furthermore, the analyze threshold depends upon inserts+updates. I think it
should also depend upon deletes, for obvious reasons.

Yes. Vacuum threshold is counting deletes, I hope?

It does.

My comment about the frequency of vacuums vs. analyze is that
currently the *default* is to analyze twice as often as you
vacuum. Based on my experiece as a PG admin on a variety of
databases, I believe that the default should be to analyze half as
often as you vacuum.

HUH? analyze is very very cheap compared to vacuum. Why not do it more
often?

I am all for experimentation. If you have real life data to play with, I
can give you some patches to play around.

I will have real data very soon .....

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

deletes are already accounted for in the analyze threshold.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.

What I think I am hearing is that people would like very much to be able
to tweak the settings of pg_autovacuum for individual tables / databases
etc., so that you could set certain tables to be vacuumed more
aggressively than others. I agree this would be a good and welcome
addition. I hope to have time to work on this at some point, but in the
near future I won't.

Matthew

#8Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Matthew T. O'Connor (#7)
Re: [HACKERS] More detail on settings for pgavd?

On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:

Shridhar Daithankar wrote:

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

deletes are already accounted for in the analyze threshold.

Yes. My bad. Deletes are not accounted for when initializing the analyze count,
but they are used later.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM all I can think of is the patch attached. Josh, is it sufficient
for you?..:-)

Matthew, I am confused about one thing. Why would autovacuum count updates
while checking the analyze threshold? Updates do not change the statistics,
right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only
inserts+deletes should suffice, shouldn't it?

Other than that, I think autovacuum does everything it can.

Comments?

Shridhar

Attachments:

difffile (text/x-diff; charset=iso-8859-1), +27/-4
#9Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Shridhar Daithankar (#8)
Re: [HACKERS] More detail on settings for pgavd?

On Thursday 20 November 2003 20:29, Shridhar Daithankar wrote:

On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:

Shridhar Daithankar wrote:

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

deletes are already accounted for in the analyze threshold.

Yes. My bad. Deletes are not accounted for when initializing the analyze count,
but they are used later.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM all I can think of is the patch attached. Josh, is it sufficient
for you?..:-)

Use this one. A warning is added for too-aggressive vacuuming. If it is OK by
everybody, we can send it to the patches list.

Shridhar

Attachments:

difffile (text/x-diff; charset=iso-8859-1), +40/-12
#10Matthew T. O'Connor
matthew@zeut.net
In reply to: Shridhar Daithankar (#8)
Re: [PERFORM] More detail on settings for pgavd?

Shridhar Daithankar wrote:

On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:

Shridhar Daithankar wrote:

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM all I can think of is the patch attached. Josh, is it sufficient
for you?..:-)

The patch just adds an example to the README, this looks ok to me.

Matthew, I am confused about one thing. Why would autovacuum count updates
while checking the analyze threshold? Updates do not change the statistics,
right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only
inserts+deletes should suffice, shouldn't it?

An update is the equivalent of an insert and a delete, so it counts
towards the analyze count as much as an insert.

Other than that, I think autovacuum does everything it can.

It could be more customizable.

#11Josh Berkus
josh@agliodbs.com
In reply to: Matthew T. O'Connor (#7)
Re: [HACKERS] More detail on settings for pgavd?

Matthew,

For small tables, you don't need to vacuum too often. In the testing I
did, a small table (~100 rows) didn't really show significant performance
degradation until it had close to 1000 updates.

This is accounted for by using the "threshold" value. That way small tables
get vacuumed less often. However, the way large tables work is very different
and I think your strategy shows a lack of testing on large active tables.

For large tables,
vacuum is so expensive, that you don't want to do it very often, and
scanning the whole table when there is only 5% wasted space is not very
helpful.

5% is probably too low, you're right ... in my experience, performance
degradation starts to set in at 10-15% updates to, for example, a 1.1 million
row table, particularly since users tend to request the most recently updated
rows. As long as we have the I/O issues that Background Writer and ARC are
intended to solve, though, I can see being less aggressive on the defaults;
perhaps 20% or 25%. If you wait until 110% of a 1.1 million row table is
updated, though, that vacuum will take an hour or more.

Additionally, you are not thinking of this in terms of an overall database
maintenance strategy. Lazy Vacuum needs to stay below the threshold of the
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
your databases. With proper configuration of pg_avd, vacuum_mem and FSM
values, it should be possible to never run a VACUUM FULL again, and as of 7.4
never run a REINDEX again either.

But this means running vacuum frequently enough that your max_fsm_pages
threshold is never reached. For a large database, that is going to have to
be more frequent than every 110% of updates, because setting
max_fsm_pages to 20,000,000 will eat your RAM.

Yes, I set the defaults a little high, perhaps, so as to err on the
side of caution. I didn't want people to say pg_autovacuum kills the
performance of my server. A small table will get vacuumed, just not
until it has reached the threshold. So a table with 100 rows will get
vacuumed after 1200 updates / deletes.

Ok, I can see that for small tables.

My testing showed that
there were no major performance problems until you reached several
thousand updates / deletes.

Sure. But several thousand updates can be only 2% of a very large table.

HUH? analyze is very very cheap compared to vacuum. Why not do it more
often?

Because nothing is cheap if it's not needed.

Analyze is needed only as often as the *aggregate distribution* of data in the
tables changes. Depending on the application, this could be frequently, but
far more often (in my experience running multiple databases for several
clients) the data distribution of very large tables changes very slowly over
time.

One client's database, for example, that I have running VACUUM from cron
scripts runs on this schedule for the main tables:
VACUUM only: twice per hour
VACUUM ANALYZE: twice per day

On the other hand, I've another client's database where most activity involves
updates to entire classes of records. They run ANALYZE at the end of every
transaction.

So if you're going to have a separate ANALYZE schedule at all, it should be
slightly less frequent than VACUUM for large tables. Either that, or drop
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
of having 2 separate schedules.

BUT .... now I see how you arrived at the logic you did. If you're testing
only on small tables, and not vacuuming them until they reach 110% updates,
then you *would* need to analyze more frequently. This is because of your
threshold value ... you'd want to analyze the small table as soon as even 30%
of its rows changed.

So the answer is to dramatically lower the threshold for the small tables.

What I think I am hearing is that people would like very much to be able
to tweak the settings of pg_autovacuum for individual tables / databases
etc.

Not from me you're not. Though that would be nice, too.

So, my suggested defaults based on our conversation above:

Vacuum threshold: 1000 records
Vacuum scale factor: 0.2
Analyze threshold: 50 records
Analyze scale factor: 0.3
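Plugged into the threshold formula discussed above, those suggestions work out roughly as follows (a sketch, not actual pg_avd output):

```python
def threshold(base, scale, ntuples):
    return base + scale * ntuples

# Suggested defaults: vacuum -v 1000 -V 0.2, analyze -a 50 -A 0.3.
for ntuples in (100, 10_000, 1_000_000):
    vac = threshold(1000, 0.2, ntuples)
    ana = threshold(50, 0.3, ntuples)
    print(f"{ntuples:>9,} rows: vacuum after {vac:>9,.0f}, analyze after {ana:>9,.0f}")
```

Note that past roughly 9,500 rows the analyze threshold overtakes the vacuum threshold, which gives the "analyze less often than vacuum" behaviour argued for above.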

--
Josh Berkus
Aglio Database Solutions
San Francisco

#12Josh Berkus
josh@agliodbs.com
In reply to: Shridhar Daithankar (#6)
Re: More detail on settings for pgavd?

Shridhar,

I would say -V 0.2-0.4 could be great as well. The fact to emphasize is that
thresholds of less than 1 should be used.

Yes, but not thresholds, scale factors of less than 1.0. Thresholds should
still be in the range of 100 to 1000.

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

n=updates + inserts *-* deletes

I'm not clear on how this is a benefit. Deletes affect the statistics, too.

What do the statistics gather, BTW? Just the number of rows, or something else
as well? I think I will put that on Hackers separately.

Number of tuples, degree of uniqueness, some sample values, and high/low
values. Just query your pg_statistics view for an example.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than documenting it?

See my post to Matthew.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#13Chester Kustarz
chester@arbor.net
In reply to: Josh Berkus (#11)
Re: [HACKERS] More detail on settings for pgavd?

On Thu, 20 Nov 2003, Josh Berkus wrote:

Additionally, you are not thinking of this in terms of an overall database
maintenance strategy. Lazy Vacuum needs to stay below the threshold of the
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
your databases. With proper configuration of pg_avd, vacuum_mem and FSM
values, it should be possible to never run a VACUUM FULL again, and as of 7.4
never run a REINDEX again either.

is there any command you can run to see how much of the FSM is filled? is
there any way to tell which tables are filling it?

Analyze is needed only as often as the *aggregate distribution* of data in the
tables changes. Depending on the application, this could be frequently, but
far more often (in my experience running multiple databases for several
clients) the data distribution of very large tables changes very slowly over
time.

analyze does 2 things for me:
1. gets reasonable aggregate statistics
2. generates STATISTICS # of bins for the most frequent hitters

(2) is very important for me. my values typically seem to have power-law
like distributions. i need enough bins to reach a "cross-over" point where
the last bin is frequent enough to make an index scan useful. also,
i want enough bins so that the planner can choose index a or b for:
select * from foo where a=n and b=m;

the selectivity of either index depends not only on the average selectivity
of index a or index b, but on n and m as well. for example, 1M row table:

value    % of rows
v1       23
v2       12
v3       4.5
v4       4
v5       3.5
...

you can see that picking an index for =v1 would be poor. picking the
20th most common value would be 0.5% selective. much better. of course
this breaks down for more complex operators, but = is fairly common.
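a sketch of that selectivity point (the frequencies and the 5% cutoff are made up for illustration; the real planner uses its own cost model):

```python
# Fraction of rows holding each of the most common values (power-law-ish).
freqs = {"v1": 0.23, "v2": 0.12, "v3": 0.045, "v4": 0.04, "v5": 0.035}

def prefers_index(value, default_freq=0.005, cutoff=0.05):
    """Assume, roughly, that an index scan only wins below ~5% selectivity."""
    return freqs.get(value, default_freq) < cutoff

print(prefers_index("v1"))  # False: matches 23% of rows, a seq scan is better
print(prefers_index("v3"))  # True: only 4.5% of rows
```

the planner can only make this distinction if ANALYZE has kept enough most-common-value bins to know which side of the cutoff a given constant falls on.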

So if you're going to have a separate ANALYZE schedule at all, it should be
slightly less frequent than VACUUM for large tables. Either that, or drop
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
of having 2 separate schedules.

i have some tables which are insert only. i do not want to vacuum them
because there are never any dead tuples in them and the vacuum grows the
indexes. plus it is very expensive (the tables grow rather large.) after they
expire i drop the whole table to make room for a newer one (making a sort
of rolling log with many large tables.)

i need to analyze them every so often so that the planner knows that
there is 1 row, 100 rows, 100k rows, 1M. the funny thing is
that because i never vacuum the tables, the relpages on the index never
grows. don't know if this affects anything (this is on 7.2.3).

vacuum is to reclaim dead tuples. this means it depends on update and
delete. analyze depends on data values/distribution. this means it depends on
insert, update, and delete. thus the dependencies are slightly different
between the 2 operations, and so you can come up with use-cases that
justify running either more frequently.

i am not sure how failed transactions fit into this though, not that i think
anybody ever has very many. maybe big rollbacks during testing?

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chester Kustarz (#13)
Re: [HACKERS] More detail on settings for pgavd?

Chester Kustarz <chester@arbor.net> writes:

i have some tables which are insert only. i do not want to vacuum them
because there are never any dead tuples in them and the vacuum grows the
indexes.

Those claims cannot both be true. In any case, plain vacuum cannot grow
the indexes --- only a VACUUM FULL that moves a significant number of
rows could cause index growth.

vacuum is to reclaim dead tuples. this means it depends on update and
delete. analyze depends on data values/distribution. this means it depends on
insert, update, and delete. thus the dependencies are slightly different
between the 2 operations, and so you can come up with use-cases that
justify running either more frequently.

Agreed.

regards, tom lane

#15Chester Kustarz
chester@arbor.net
In reply to: Tom Lane (#14)
Re: [HACKERS] More detail on settings for pgavd?

On Thu, 20 Nov 2003, Tom Lane wrote:

Those claims cannot both be true. In any case, plain vacuum cannot grow
the indexes --- only a VACUUM FULL that moves a significant number of
rows could cause index growth.

er, yeah. you're right of course. having flashbacks of vacuum full.

#16Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#14)
Re: [HACKERS] More detail on settings for pgavd?

Tom Lane wrote:

Chester Kustarz <chester@arbor.net> writes:

vacuum is to reclaim dead tuples. this means it depends on update and
delete. analyze depends on data values/distribution. this means it depends on
insert, update, and delete. thus the dependencies are slightly different
between the 2 operations, and so you can come up with use-cases that
justify running either more frequently.

Agreed.

And that is why pg_autovacuum looks at insert, update and delete when
deciding to do an analyze, but only looks at update and delete when
deciding to do a vacuum. In addition, this is why pg_autovacuum was
given knobs so that the vacuum and analyze thresholds can be set
independently.

Matthew

#17Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#11)
Re: [HACKERS] More detail on settings for pgavd?

Josh Berkus wrote:

Matthew,

For small tables, you don't need to vacuum too often. In the testing I
did, a small table (~100 rows) didn't really show significant performance
degradation until it had close to 1000 updates.

This is accounted for by using the "threshold" value. That way small tables
get vacuumed less often. However, the way large tables work is very different
and I think your strategy shows a lack of testing on large active tables.

Probably more true than I would like to think...

For large tables,
vacuum is so expensive, that you don't want to do it very often, and
scanning the whole table when there is only 5% wasted space is not very
helpful.

5% is probably too low, you're right ... in my experience, performance
degradation starts to set in at 10-15% updates to, for example, a 1.1 million
row table, particularly since users tend to request the most recently updated
rows. As long as we have the I/O issues that Background Writer and ARC are
intended to solve, though, I can see being less aggressive on the defaults;
perhaps 20% or 25%. If you wait until 110% of a 1.1 million row table is
updated, though, that vacuum will take an hour or more.

True, but I think it would be one hour once, rather than 30 minutes 4 times.

Additionally, you are not thinking of this in terms of an overall database
maintenance strategy. Lazy Vacuum needs to stay below the threshold of the
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
your databases. With proper configuration of pg_avd, vacuum_mem and FSM
values, it should be possible to never run a VACUUM FULL again, and as of 7.4
never run a REINDEX again either.

This is one of the things I had hoped to add to pg_autovacuum, but never
got to. In addition to just the information from the stats collector on
inserts updates and deletes, pg_autovacuum should also look at the FSM,
and make decisions based on it. Anyone looking for a project?

But this means running vacuum frequently enough that your max_fsm_pages
threshold is never reached. For a large database, that is going to have to
be more frequent than every 110% of updates, because setting
max_fsm_pages to 20,000,000 will eat your RAM.

Again, I think the only way to do this efficiently is to look at the
FSM. Otherwise the only way to make sure you keep the FSM populated is
to run vacuum more often than needed.

Yes, I set the defaults a little high, perhaps, so as to err on the
side of caution. I didn't want people to say pg_autovacuum kills the
performance of my server. A small table will get vacuumed, just not
until it has reached the threshold. So a table with 100 rows will get
vacuumed after 1200 updates / deletes.

Ok, I can see that for small tables.

My testing showed that
there were no major performance problems until you reached several
thousand updates / deletes.

Sure. But several thousand updates can be only 2% of a very large table.

But I can't imagine that 2% makes any difference on a large table. In
fact I would think that 10-15% would hardly be noticeable; beyond that
I'm not sure.

HUH? analyze is very very cheap compared to vacuum. Why not do it more
often?

Because nothing is cheap if it's not needed.

Analyze is needed only as often as the *aggregate distribution* of data in the
tables changes. Depending on the application, this could be frequently, but
far more often (in my experience running multiple databases for several
clients) the data distribution of very large tables changes very slowly over
time.

Valid points, and again I think this points to the fact that
pg_autovacuum needs to be more configurable. Being able to set
different thresholds for different tables will help considerably. In
fact, you may find that some tables should have a vac threshold much
larger than the analyze threshold, while other tables might want the
opposite.

One client's database, for example, that I have running VACUUM from cron
scripts runs on this schedule for the main tables:
VACUUM only: twice per hour
VACUUM ANALYZE: twice per day

I would be surprised if you could notice the difference between a vacuum
analyze and a plain vacuum, especially on large tables.

On the other hand, I've another client's database where most activity involves
updates to entire classes of records. They run ANALYZE at the end of every
transaction.

So if you're going to have a separate ANALYZE schedule at all, it should be
slightly less frequent than VACUUM for large tables. Either that, or drop
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
of having 2 separate schedules.

I think you need two separate schedules. There are lots of times where
a vacuum doesn't help, and an analyze is all that is needed, and an
analyze is MUCH cheaper than a vacuum.

BUT .... now I see how you arrived at the logic you did. If you're testing
only on small tables, and not vacuuming them until they reach 110% updates,
then you *would* need to analyze more frequently. This is because of your
threshold value ... you'd want to analyze the small table as soon as even 30%
of its rows changed.

So the answer is to dramatically lower the threshold for the small tables.

Perhaps.

What I think I am hearing is that people would like very much to be able
to tweak the settings of pg_autovacuum for individual tables / databases
etc.

Not from me you're not. Though that would be nice, too.

So, my suggested defaults based on our conversation above:

Vacuum threshold: 1000 records
Vacuum scale factor: 0.2
Analyze threshold: 50 records
Analyze scale factor: 0.3
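For concreteness, those proposed defaults plug into the usual linear threshold formula (threshold = base + scale * reltuples). The sketch below is illustrative Python, not the actual pg_autovacuum code, and the function names are made up:

```python
# Sketch of a linear autovacuum-style threshold using the proposed
# defaults above. Names and signatures are illustrative only.

def vacuum_threshold(reltuples, base=1000, scale=0.2):
    """Rows that must change before a VACUUM would be triggered."""
    return base + scale * reltuples

def analyze_threshold(reltuples, base=50, scale=0.3):
    """Rows that must change before an ANALYZE would be triggered."""
    return base + scale * reltuples
```

With these numbers, a 10,000-row table would be vacuumed after 3,000 changed rows and analyzed after 3,050, while a 100-row table would be analyzed after only 80 changed rows, well before its vacuum threshold of 1,020.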

I'm open to discussion on changing the defaults. Perhaps it would be
better to use some non-linear (perhaps logarithmic) scaling factor, so
that you wound up with something roughly like this:

#tuples activity% for vacuum
1k 100%
10k 70%
100k 45%
1M 20%
10M 10%
100M 8%
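One way to realize a table like that is to interpolate linearly in log10 of the tuple count between the listed points. The following Python sketch only illustrates the shape of such a curve; it is not proposed pg_autovacuum code:

```python
import math

# (tuples, activity % for vacuum) pairs taken from the table above.
POINTS = [(1e3, 100), (1e4, 70), (1e5, 45), (1e6, 20), (1e7, 10), (1e8, 8)]

def activity_pct(ntuples):
    """Vacuum activity %, interpolated linearly in log10(tuples)."""
    if ntuples <= POINTS[0][0]:
        return POINTS[0][1]
    if ntuples >= POINTS[-1][0]:
        return POINTS[-1][1]
    for (x0, y0), (x1, y1) in zip(POINTS, POINTS[1:]):
        if ntuples <= x1:
            # Fractional position between the two points, on a log scale.
            t = (math.log10(ntuples) - math.log10(x0)) / (math.log10(x1) - math.log10(x0))
            return y0 + t * (y1 - y0)
```

Below 1k tuples the curve is clamped at 100%, and above 100M it is clamped at 8%, matching the endpoints of the table.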

Thanks for the lucid feedback / discussion. autovacuum is a feature
that, despite its simple implementation, has generated a lot of
feedback from users, and I would really like to see it become something
closer to what it should be.

#18Josh Berkus
josh@agliodbs.com
In reply to: Matthew T. O'Connor (#17)
Re: [HACKERS] More detail on settings for pgavd?

Matthew,

If 110% of a 1.1 million row table is updated, though, that vacuum will
take an hour or more.

True, but I think it would be one hour once, rather than 30 minutes 4
times.

Well, generally it would be about 6-8 times at 2-4 minutes each.

This is one of the things I had hoped to add to pg_autovacuum, but never
got to. In addition to just the information from the stats collector on
inserts updates and deletes, pg_autovacuum should also look at the FSM,
and make decisions based on it. Anyone looking for a project?

Hmmm ... I think that's the wrong approach. Once your database is populated,
it's very easy to determine how to set the FSM for a given pg_avd level. If
you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of
the total database pages plus growth & safety margins.
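That back-of-the-envelope sizing might be sketched as follows; the function name and the margin figures are illustrative assumptions, not recommendations:

```python
def fsm_pages_needed(total_db_pages, vacuum_fraction=0.20,
                     growth_margin=0.10, safety_margin=0.10):
    """Rough fsm_pages sizing: if VACUUM fires after vacuum_fraction of
    rows change, about that fraction of pages may hold reclaimable space
    between vacuums; pad the result with growth and safety margins."""
    return int(total_db_pages * (vacuum_fraction + growth_margin + safety_margin))
```

For example, an 8 GB database is roughly a million 8 KB pages, so this sketch would suggest fsm_pages on the order of 400,000 rather than the default 10,000.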

I'd be really reluctant to base pg_avd frequency on the fsm settings instead.
What if the user loads 8GB of data but leaves fsm_pages at the default of
10,000? You can't do much with that; you'd have to vacuum if even 1% of the
data changed.

The other problem is that calculating data pages from a count of
updates+deletes would require pg_avd to keep more statistics and do more math
for every table. Do we want to do this?

But I can't imagine that 2% makes any difference on a large table. In
fact I would think that 10-15% would hardly be noticeable, beyond that
I'm not sure.

I've seen performance lag at 10% of records, especially in tables where both
update and select activity focus on one subset of the table (calendar tables,
for example).

Valid points, and again I think this points to the fact that
pg_autovacuum needs to be more configurable. Being able to set
different thresholds for different tables will help considerably. In
fact, you may find that some tables should have a vac threshold much
larger than the analyze threshold, while other tables might want the
opposite.

Sure. Though I think we can make the present configuration work with a little
adjustment of the numbers. I'll have a chance to test on production
databases soon.

I would be surprised if you can notice the difference between a vacuum
analyze and a vacuum, especially on large tables.

It's substantial for tables with high statistics settings. A 1,000,000 row
table with 5 columns set to statistics=250 can take 3 minutes to analyze on a
medium-grade server.

I think you need two separate schedules. There are lots of times where
a vacuum doesn't help, and an analyze is all that is needed

Agreed. And I've just talked to a client who may want to use pg_avd's ANALYZE
scheduling but not use vacuum at all. BTW, I think we should have a setting
for this; for example, if -V is -1, don't vacuum.

I'm open to discussion on changing the defaults. Perhaps it would be
better to use some non-linear (perhaps logarithmic) scaling factor, so
that you wound up with something roughly like this:

#tuples activity% for vacuum
1k 100%
10k 70%
100k 45%
1M 20%
10M 10%
100M 8%

That would be cool, too. Though a count of data pages would be a better
scale than a count of rows, and equally obtainable from pg_class.

Thanks for the lucid feedback / discussion. autovacuum is a feature
that, despite its simple implementation, has generated a lot of
feedback from users, and I would really like to see it become something
closer to what it should be.

Well, I hope to help now. Until very recently, I've not had a chance to
seriously look at pg_avd and test it in production. Now that I do, I'm
interested in improving it.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#19Robert Treat
xzilla@users.sourceforge.net
In reply to: Matthew T. O'Connor (#17)
Re: [HACKERS] More detail on settings for pgavd?

On Thu, 2003-11-20 at 19:40, Matthew T. O'Connor wrote:

I'm open to discussion on changing the defaults. Perhaps it would be
better to use some non-linear (perhaps logarithmic) scaling factor, so
that you wound up with something roughly like this:

#tuples activity% for vacuum
1k 100%
10k 70%
100k 45%
1M 20%
10M 10%
100M 8%

Just thinking out loud here, so disregard if you think it's chaff, but...
if we had a system table pg_avd_defaults that held what we generally
consider the best default percentages based on reltuples/pages, and
added a column to pg_class (could be some place better but..) which
could hold an overriding percentage, you could then have a column added
to pg_stat_all_tables called vacuum_percentage, which would be a
coalesce of the override percentage or the default percentages based on
rel_tuples (or rel_pages). This would give autovacuum a place to look
for each table as to when it should vacuum, and gives administrators the
option to tweak it on a per table basis if they find they need a
specific table to vacuum at a different rate than the "standard".

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

#20Matthew T. O'Connor
matthew@zeut.net
In reply to: Robert Treat (#19)
Re: [HACKERS] More detail on settings for pgavd?

Robert Treat wrote:

Just thinking out loud here, so disregard if you think it's chaff, but...
if we had a system table pg_avd_defaults

[snip]

As long as pg_autovacuum remains a contrib module, I don't think any
changes to the system catalogs will be made. If pg_autovacuum is
deemed ready to move out of contrib, then we can talk about the above.

#21Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#18)
#22Shridhar Daithankar
shridhar_daithankar@myrealbox.com
In reply to: Matthew T. O'Connor (#21)
#23Matthew T. O'Connor
matthew@zeut.net
In reply to: Shridhar Daithankar (#22)
#24Josh Berkus
josh@agliodbs.com
In reply to: Matthew T. O'Connor (#20)
#25Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
#26Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#24)
#27Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
#28Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#25)
#29Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#27)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#25)
#31Chester Kustarz
chester@arbor.net
In reply to: Matthew T. O'Connor (#28)
#32Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#30)
#33Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Matthew T. O'Connor (#26)
#34Chris Browne
cbbrowne@acm.org
In reply to: Josh Berkus (#1)
#35Andrew Sullivan
andrew@libertyrms.info
In reply to: Bruce Momjian (#32)