Re: CPU-intensive autovacuuming

Started by Phil Endecott over 20 years ago · 111 messages
#1Phil Endecott
spam_from_postgresql_general@chezphil.org

Following up on my own post from last night:

Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?

Browsing the code using webcvs, I have found this:

for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {

I haven't really tried to understand what is going on in here, but it
does look like it is getting the result of the "pg_class join stats"
query and then matching it up against its internal list of tables using
nested loops, which is undoubtedly O(n^2) in the number of tables.

Have I correctly understood what is going on here?

--Phil.

#2Matthew T. O'Connor
matthew@zeut.net
In reply to: Phil Endecott (#1)

Phil Endecott wrote:

Following up on my own post from last night:

Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?

Browsing the code using webcvs, I have found this:

for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {

I haven't really tried to understand what is going on in here, but it
does look like it is getting the result of the "pg_class join stats"
query and then matching it up against its internal list of tables
using nested loops, which is undoubtedly O(n^2) in the number of tables.

Have I correctly understood what is going on here?

Indeed you have. I have heard a few similar reports but perhaps none as
bad as yours. One person added a small sleep value so that the loop
doesn't spin so tightly. You could also just increase the sleep delay so
that it doesn't do this work quite so often. No other quick suggestions.

#3Phil Endecott
spam_from_postgresql_general@chezphil.org
In reply to: Matthew T. O'Connor (#2)

Matthew T. O'Connor wrote:

Phil Endecott wrote:

Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?

Browsing the code using webcvs, I have found this:

for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {
Have I correctly understood what is going on here?

Indeed you have. I have heard a few similar reports but perhaps none as
bad as yours. One person added a small sleep value so that the loop
doesn't spin so tightly. You could also just increase the sleep delay so
that it doesn't do this work quite so often. No other quick suggestions.

I do wonder why autovacuum is keeping its table list in memory rather
than in the database.

But given that it is keeping it in memory, I think the real fix is to
sort that list (or keep it ordered when building or updating it). It is
trivial to also get the query results ordered, and they can then be
compared in O(n) time.
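
Roughly what I have in mind (a sketch only, not the actual pg_autovacuum
code; I'm assuming the query result is sorted by oid in column 0 and that
tbl_info carries its oid in a relid field):

int      i = 0;
Dlelem  *tbl_elem = DLGetHead(dbs->table_list);

/* Walk the sorted result and the sorted list in lockstep; every step
 * advances at least one side, so the comparison is O(n), not O(n^2). */
while (i < PQntuples(res) && tbl_elem != NULL)
{
    Oid res_oid = (Oid) strtoul(PQgetvalue(res, i, 0), NULL, 10);
    Oid tbl_oid = ((tbl_info *) DLE_VAL(tbl_elem))->relid;

    if (res_oid == tbl_oid)
    {
        /* matched: update this table's stats from the result row */
        i++;
        tbl_elem = DLGetSucc(tbl_elem);
    }
    else if (res_oid < tbl_oid)
        i++;                            /* row with no list entry: a new table */
    else
        tbl_elem = DLGetSucc(tbl_elem); /* list entry with no row: a dropped table */
}
/* whatever remains on either side is all-new or all-dropped */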

I notice various other places where there seem to be nested loops, e.g.
in the update_table_list function. I'm not sure if they can be fixed by
similar means.

--Phil.

#4Matthew T. O'Connor
matthew@zeut.net
In reply to: Phil Endecott (#3)

Phil Endecott wrote:

Matthew T. O'Connor wrote:

Indeed you have. I have heard a few similar reports but perhaps none
as bad as yours. One person added a small sleep value so that the loop
doesn't spin so tightly. You could also just increase the sleep delay so
that it doesn't do this work quite so often. No other quick
suggestions.

I do wonder why autovacuum is keeping its table list in memory rather
than in the database.

For better or worse, this was a conscious design decision that the
contrib version of autovacuum be non-invasive to your database.

But given that it is keeping it in memory, I think the real fix is to
sort that list (or keep it ordered when building or updating it). It
is trivial to also get the query results ordered, and they can then be
compared in O(n) time.

I'm quite sure there is a better way; please submit a patch if you can.
This was never a real concern for most people since the number of tables
is typically small enough not to be a problem. The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.

I notice various other places where there seem to be nested loops,
e.g. in the update_table_list function. I'm not sure if they can be
fixed by similar means.

I would think so, they all basically do the same type of loop.

#5Phil Endecott
spam_from_postgresql_general@chezphil.org
In reply to: Matthew T. O'Connor (#4)

Matthew T. O'Connor wrote:

The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.

What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will arrive
eventually. I am currently running vacuums from a cron job and I think
I will be happy with that for the time being.

(Incidentally, I have also found that the indexes on my pg_attribute
table were taking up over half a gigabyte, which came down to less than
40 megs after reindexing them. Is there a case for having autovacuum
also call reindex?)

--Phil.

#6Matthew T. O'Connor
matthew@zeut.net
In reply to: Phil Endecott (#5)

Phil Endecott wrote:

Matthew T. O'Connor wrote:

The integrated version of autovacuum that didn't make the cut before
8.0 avoids this problem since the autovacuum data is stored in the
database.

What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will
arrive eventually. I am currently running vacuums from a cron job and
I think I will be happy with that for the time being.

This is a good question :-) I have been so busy with work lately that I
have not been able to work on it. I am currently trying to resurrect
the patch I sent in for 8.0 and update it so that it applies against
HEAD. Once that is done, I will need help from someone with the
portions of the work that I'm not comfortable with / capable of. The main
issue with the version I created during the 8.0 devel cycle is that it used
libpq to connect, query and issue commands against the databases. This
was deemed bad, and I need help setting up the infrastructure to make
this happen without libpq. I hope to have my patch applying against
HEAD sometime this week but it probably won't happen till next week.

So the summary of the autovacuum integration status is that we are fast
running out of time (feature freeze July 1), and I have very little time
to devote to this task. So you might want to submit your O(n) patch,
because unfortunately it looks like integrated autovacuum might slip
another release unless someone else steps up to work on it.

(Incidentally, I have also found that the indexes on my pg_attribute
table were taking up over half a gigabyte, which came down to less
than 40 megs after reindexing them. Is there a case for having
autovacuum also call reindex?)

Yes, there is certainly some merit to having autovacuum or something
similar perform other system maintenance tasks such as reindexing. I
just haven't taken it there yet. It does seem strange that your
pg_attribute table got that big; anyone have any insight here? You did
say you are using 7.4.2; I forget if that has the index space reclaiming
code in vacuum, and there are also some autovacuum bugs in the early
7.4.x releases. You might try upgrading to either 8.0.x or a later 7.4.x
release.

Matthew O'Connor

#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Phil Endecott (#5)

Phil Endecott wrote:

Matthew T. O'Connor wrote:

The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.

What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will arrive
eventually. I am currently running vacuums from a cron job and I think
I will be happy with that for the time being.

I will post about integrating pg_autovacuum into the backend for 8.1 in
a few minutes.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Phil Endecott (#5)

Phil Endecott <spam_from_postgresql_general@chezphil.org> writes:

(Incidentally, I have also found that the indexes on my pg_attribute
table were taking up over half a gigabyte, which came down to less than
40 megs after reindexing them. Is there a case for having autovacuum
also call reindex?)

Lots of temp tables I suppose? If so that's not autovacuum's fault;
it wasn't getting told about the activity in pg_attribute until this
patch:

2005-03-31 18:20 tgl

* src/backend/postmaster/: pgstat.c (REL7_4_STABLE), pgstat.c
(REL8_0_STABLE), pgstat.c: Flush any remaining statistics counts
out to the collector at process exit. Without this, operations
triggered during backend exit (such as temp table deletions) won't
be counted ... which given heavy usage of temp tables can lead to
pg_autovacuum falling way behind on the need to vacuum pg_class and
pg_attribute. Per reports from Steve Crawford and others.

Unless the bloat occurred after you updated to 8.0.2, there's no issue.

regards, tom lane

#9Thomas F. O'Connell
tfo@sitening.com
In reply to: Phil Endecott (#3)

Phil,

If you complete this patch, I'm very interested to see it.

I think I'm the person Matthew is talking about who inserted a sleep
value. Because of the sheer number of tables involved, even small
values of sleep caused pg_autovacuum to iterate too slowly over its
table lists to be of use in a production environment (where I still
find its behavior to be preferable to a complicated list of manual
vacuums performed in cron).

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 7, 2005, at 6:16 AM, Phil Endecott wrote:


Matthew T. O'Connor wrote:

Phil Endecott wrote:

Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?

Browsing the code using webcvs, I have found this:

for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {

Have I correctly understood what is going on here?

Indeed you have. I have heard a few similar reports but perhaps
none as bad as yours. One person added a small sleep value so that
the loop doesn't spin so tightly. You could also just increase the
sleep delay so that it doesn't do this work quite so often. No other
quick suggestions.

I do wonder why autovacuum is keeping its table list in memory
rather than in the database.

But given that it is keeping it in memory, I think the real fix is
to sort that list (or keep it ordered when building or updating
it). It is trivial to also get the query results ordered, and they
can then be compared in O(n) time.

I notice various other places where there seem to be nested loops,
e.g. in the update_table_list function. I'm not sure if they can
be fixed by similar means.

--Phil.

#10Shelby Cain
alyandon@yahoo.com
In reply to: Thomas F. O'Connell (#9)
--- "Thomas F. O'Connell" <tfo@sitening.com> wrote:

Phil,

If you complete this patch, I'm very interested to see it.

I think I'm the person Matthew is talking about who inserted a sleep
value. Because of the sheer number of tables involved, even small
values of sleep caused pg_autovacuum to iterate too slowly over its
table lists to be of use in a production environment (where I still
find its behavior to be preferable to a complicated list of manual
vacuums performed in cron).

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Were you sleeping every time through the loop? How about something
like:

if (j % 500 == 1)
    usleep(100000);
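
In context, grafted onto the loop quoted at the top of the thread
(illustrative only):

for (j = 0; j < PQntuples(res); j++)
{
    /* yield briefly every 500 result rows rather than every iteration */
    if (j % 500 == 1)
        usleep(100000);             /* 100 ms */

    /* ... existing matching work against the table list ... */
}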

Regards,

Shelby Cain


#11Thomas F. O'Connell
tfo@sitening.com
In reply to: Shelby Cain (#10)

I was usleeping in tiny increments in each iteration of the loop. I
didn't try breaking it into iterative groups like this.

Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.

But I hope if anyone completes the linear improvement, they'll post
to the lists.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 10, 2005, at 9:12 AM, Shelby Cain wrote:

--- "Thomas F. O'Connell" <tfo@sitening.com> wrote:

Were you sleeping every time through the loop? How about something
like:

if (j % 500 == 1)
    usleep(100000);

Regards,

Shelby Cain

#12Hugo
htakada@gmail.com
In reply to: Thomas F. O'Connell (#11)
how to return a result set from a stored procedure

Hi everybody

I am trying to write a stored procedure that returns a result set, but it
is not working.
This is the function:
///
CREATE OR REPLACE FUNCTION
remisiones.fn_get_total_remitidoxprovision1("numeric")

RETURNS SETOF record AS
$BODY$
begin
select rm.provision as provision,
drm.producto as producto,
sum(drm.cantidad) as cantidad
FROM remisiones.remisiones rm, remisiones.detalles_remision drm
WHERE rm.remision = drm.remision and rm.provision = $1
GROUP BY rm.provision, drm.producto
ORDER BY rm.provision, drm.producto;
end;$BODY$

///
If I call this function from the interactive SQL window of pgAdmin III, I
get this result:
select * from fn_gert_total_remitidosxprovision(1)
---------------------------------------------------------------------------
row  refcursor
1    <unnamed portal 1>

Is there a way to display the values of the rows returned? I need it
because I have to use it in a DataWindow definition in a PowerBuilder app.

thanks in advance

Hugo

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas F. O'Connell (#11)

"Thomas F. O'Connell" <tfo@sitening.com> writes:

Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.

This would be vastly easier to fix if the code were integrated into the
backend first. In the backend environment you could just keep the info
in a dynahash.c hashtable instead of in a linear list. On the client
side, you have to roll your own hashing (or adapt dynahash to life
outside the backend environment).
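
For illustration, a minimal backend-style sketch of that; the entry struct
and hash table name here are invented, not actual autovacuum code:

/* one-time setup: a hash table keyed by relation oid (backend only) */
typedef struct avac_entry
{
    Oid         relid;              /* hash key; must be the first field */
    /* ... per-table vacuum bookkeeping would go here ... */
} avac_entry;

HASHCTL     ctl;
HTAB       *table_hash;
bool        found;
Oid         relid;                  /* oid from the current result row */
avac_entry *entry;

MemSet(&ctl, 0, sizeof(ctl));
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(avac_entry);
ctl.hash = oid_hash;
table_hash = hash_create("autovacuum table info", 256, &ctl,
                         HASH_ELEM | HASH_FUNCTION);

/* per result row: expected O(1) lookup/insert instead of a list walk */
entry = (avac_entry *) hash_search(table_hash, &relid,
                                   HASH_ENTER, &found);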

regards, tom lane

#14Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#13)
Autovacuum in the backend

One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:

o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes

Who would like to get started on this? It seems pretty straight-forward.

---------------------------------------------------------------------------

Tom Lane wrote:

"Thomas F. O'Connell" <tfo@sitening.com> writes:

Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.

This would be vastly easier to fix if the code were integrated into the
backend first. In the backend environment you could just keep the info
in a dynahash.c hashtable instead of in a linear list. On the client
side, you have to roll your own hashing (or adapt dynahash to life
outside the backend environment).

regards, tom lane


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: Autovacuum in the backend

Bruce Momjian <pgman@candle.pha.pa.us> writes:

One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:

o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes

Who would like to get started on this? It seems pretty straight-forward.

A small problem here is that until you get at least to step 3
(backend-standard error handling), none of it is going to be acceptable
to commit. So I don't entirely buy Bruce's notion of bite-size pieces
of work. You can certainly work on it in that fashion, but it's not
going into 8.1 unless most of the above is done by the end of the month.

regards, tom lane

#16Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#13)
Re: Autovacuum in the backend

"Bruce Momjian" <pgman@candle.pha.pa.us> writes

One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:

o move it into the backend and have it start/stop automatically

The start/stop routine is quite like the bgwriter's. It requires pgstats to
be turned on. If it aborts unexpectedly, hopefully we could restart it.
Shall we have a RequestVacuum() to pass control to this process, so as to
avoid possible redundant vacuums from the user side?

o move the autovacuum configuration parameters into postgresql.conf

There are some GUC parameters that are correlated and must be set in order
to incorporate it:
* stats_start_collector = true
* stats_row_level = true

and add a parameter to let the user pass in the configuration parameters:
* autovacuum_command = "-s 100 -S 1 ..."

So if autovacuum_command is given, would we automatically set the above two
parameters to true?

o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes

Change "connect/disconnect server" to "start/stop autovacuum process";
Change "execute query" to "backend APIs";
Change "list" to "hash";
Need think more to handle various error conditions ...

Who would like to get started on this? It seems pretty straight-forward.

I'd like to give it a try.

Regards,
Qingqing

#17Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Bruce Momjian (#14)
Re: Autovacuum in the backend

"Tom Lane" <tgl@sss.pgh.pa.us> writes

A small problem here is that until you get at least to step 3
(backend-standard error handling), none of it is going to be acceptable
to commit. So I don't entirely buy Bruce's notion of bite-size pieces
of work. You can certainly work on it in that fashion, but it's not
going into 8.1 unless most of the above is done by the end of the month.

Scared ...

Regards,
Qingqing

#18Neil Conway
neilc@samurai.com
In reply to: Qingqing Zhou (#16)
Re: Autovacuum in the backend

Qingqing Zhou wrote:

The start/stop routine is quite like Bgwriter. It requires PgStats to be
turned on.

Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?

-Neil

#19Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#13)
Re: Autovacuum in the backend

"Neil Conway" <neilc@samurai.com> writes

Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?

I don't understand. Currently the basic logic of pg_autovacuum is to use the
pg_stat_all_tables numbers like n_tup_upd, n_tup_del to determine if a
relation needs to be vacuumed. How would we use the FSM to get this
information?

Regards,
Qingqing

#20Hannu Krosing
hannu@skype.net
In reply to: Bruce Momjian (#14)
Re: Autovacuum in the backend

On Tue, 2005-06-14 at 21:23 -0400, Bruce Momjian wrote:

One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:

o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes

Who would like to get started on this? It seems pretty straight-forward.

Can autovacuum yet be configured _not_ to run vacuum during some hours
or above some load?

Even better - to stop or pause a long-running vacuum if the load goes above
some limit.

If it goes into the backend before the above is done, it should at least be
possible to switch it off completely.

--
Hannu Krosing <hannu@skype.net>

#21Hannu Krosing
hannu@skype.net
In reply to: Qingqing Zhou (#19)
Re: Autovacuum in the backend

On Wed, 2005-06-15 at 13:41 +0800, Qingqing Zhou wrote:

"Neil Conway" <neilc@samurai.com> writes

Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?

I don't understand. Currently the basic logic of pg_autovacuum is to use the
pg_stat_all_tables numbers like n_tup_upd, n_tup_del to determine if a
relation needs to be vacuumed. How would we use the FSM to get this
information?

One probably can't use the FSM as it is, since the FSM is filled in by
vacuum and this creates a circular dependency.

But it would be very nice to have something _similar_ to the FSM, say a DSM
(dead space map), which is filled in when a tuple is marked as "dead for
all running backends", and which could be used to implement a vacuum that
visits only those pages which actually contain removable tuples.

--
Hannu Krosing <hannu@skype.net>

#22Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#13)
Re: Autovacuum in the backend

"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes

The start/stop routine is quite like Bgwriter.

I just realized that a non-standard backend can't execute any SQL command.
If so, how would the background pg_autovacuum do "vacuum"? It could be
forked more like a standard backend, but this is obviously not a good idea,
since we don't know which database this process will sit on.

A possible solution is that the background pg_autovacuum could fork another
process to connect to the postmaster as an ordinary backend each time it
feels that a "vacuum" is needed.

Any ideas?

Regards,
Qingqing

#23Roman Neuhauser
neuhauser@sigpipe.cz
In reply to: Hugo (#12)
Re: how to return a result set from a stored procedure

# htakada@gmail.com / 2005-06-11 08:44:32 -0400:

Hi everybody

I am trying to write a stored procedure that returns a result set but it is
not working
this is the function:
///
CREATE OR REPLACE FUNCTION
remisiones.fn_get_total_remitidoxprovision1("numeric")

RETURNS SETOF record AS
$BODY$
begin
select rm.provision as provision,
drm.producto as producto,
sum(drm.cantidad) as cantidad
FROM remisiones.remisiones rm, remisiones.detalles_remision drm
WHERE rm.remision = drm.remision and rm.provision = $1
GROUP BY rm.provision, drm.producto
ORDER BY rm.provision, drm.producto;
end;$BODY$

///

http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#AEN32875

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

#24Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Hannu Krosing (#21)
Re: Autovacuum in the backend

Hannu Krosing wrote:

On Wed, 2005-06-15 at 13:41 +0800, Qingqing Zhou wrote:

"Neil Conway" <neilc@samurai.com> writes

Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?

I don't understand. Currently the basic logic of pg_autovacuum is to use the
pg_stat_all_tables numbers like n_tup_upd, n_tup_del to determine if a
relation needs to be vacuumed. How would we use the FSM to get this
information?

One probably can't use the FSM as it is, since the FSM is filled in by
vacuum and this creates a circular dependency.

But it would be very nice to have something _similar_ to the FSM, say a DSM
(dead space map), which is filled in when a tuple is marked as "dead for
all running backends", and which could be used to implement a vacuum that
visits only those pages which actually contain removable tuples.

Yes, those are step five. The TODO list has:

* Auto-vacuum
o Move into the backend code
o Scan the buffer cache to find free space or use background writer
o Use free-space map information to guide refilling
o Do VACUUM FULL if table is nearly empty?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#25Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Qingqing Zhou (#22)
Re: Autovacuum in the backend

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

---------------------------------------------------------------------------

Qingqing Zhou wrote:

"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes

The start/stop routine is quite like Bgwriter.

I just realized that a non-standard backend can't execute any SQL command.
If so, how would the background pg_autovacuum do "vacuum"? It could be
forked more like a standard backend, but this is obviously not a good idea,
since we don't know which database this process will sit on.

A possible solution is that the background pg_autovacuum could fork another
process to connect to the postmaster as an ordinary backend each time it
feels that a "vacuum" is needed.

Any ideas?

Regards,
Qingqing


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#26Alvaro Herrera
alvherre@surnet.cl
In reply to: Qingqing Zhou (#16)
Re: Autovacuum in the backend

On Wed, Jun 15, 2005 at 11:23:20AM +0800, Qingqing Zhou wrote:

Who would like to get started on this? It seems pretty straight-forward.

I'd like to give it a try.

I'm on it. I have Matthew's patch, updated to current sources, and I'm
working on cleaning it up to address all known concerns. I expect to be
able to have something for patches early next week, which can be
discussed.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Postgres is bloatware by design: it was built to house
PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

#27Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Alvaro Herrera (#26)
Re: Autovacuum in the backend

Alvaro Herrera wrote:

On Wed, Jun 15, 2005 at 11:23:20AM +0800, Qingqing Zhou wrote:

Who would like to get started on this? It seems pretty straight-forward.

I'd like to give it a try.

I'm on it. I have Matthew's patch, updated to current sources, and I'm
working on cleaning it up to address all known concerns. I expect to be
able to have something for patches early next week, which can be
discussed.

Oh, excellent. Thanks. Please look at the patch I just applied to
pg_autovacuum today and merge that into what you have. Great!

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#28Josh Berkus
josh@agliodbs.com
In reply to: Qingqing Zhou (#16)
Re: Autovacuum in the backend

Qingqing,

add a parameter to let user pass in the configuration parameters:
* autovacuum_command = "-s 100 -S 1 ..."

um, can we have these as separate GUCs and not lumped together as a string?
i.e.:
autovacuum_frequency = 60 #seconds, 0 = disable
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_multiple = 0.5
autovacuum_analyze_threshold = 100
autovacuum_analyze_multiple = 0.4

AV should be disabled by default. It should also automatically use the global
vacuum_delay settings.

But it would be very nice to have something _similar_ to FSM, say DSM
(dead space map), which is filled in when a tuple is marked as "dead for
all running backends", which could be used to implement a vacuum which
vacuums only those pages, which do actually contain removable tuples.

Speaking of FSM improvements, it would be **really** useful to have a pg_stats
view that lets you know how full the FSM is, overall. Something like:

pg_stats_fsm_usage
 fsm_relations | fsm_relations_used | fsm_pages | fsm_pages_used
          1000 |                312 |    200000 |          11579

This would allow for other schemes of vacuum automation.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#29Matthew T. O'Connor
matthew@zeut.net
In reply to: Bruce Momjian (#24)
Re: Autovacuum in the backend

Bruce Momjian wrote:

Hannu Krosing wrote:

On K, 2005-06-15 at 13:41 +0800, Qingqing Zhou wrote:

"Neil Conway" <neilc@samurai.com> writes

Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?

Yes, those are step five. The TODO list has:

* Auto-vacuum
o Move into the backend code
o Scan the buffer cache to find free space or use background writer
o Use free-space map information to guide refilling
o Do VACUUM FULL if table is nearly empty?

I think the two can be used in conjunction with one another, and perhaps
one day it could be used with or without the stats system. Integrating
FSM information has to be helpful, but it also isn't going to tell us
when to do an analyze due to lots of inserts; also, the FSM (if not big
enough) is lossy and might not be tracking all the tables. So I think
for 8.1, if it only used stats, that would be OK.

#30Matthew T. O'Connor
matthew@zeut.net
In reply to: Bruce Momjian (#25)
Re: Autovacuum in the backend

Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

It is a big question; at least, it's the main stumbling block I had, and
it's the one that kept my work from being integrated into 8.0 (which
sidestepped the issue by using libpq to connect to the server and fire
off commands).


---------------------------------------------------------------------------

Qingqing Zhou wrote:

I just realized that the non-standard backend can't execute any SQL command.
If so, how would the background pg_autovacuum do "vacuum"? It could be
forked more like a standard backend, but this is obviously not a good idea,
since we don't know which database this process will sit on.

A possible solution is that backgroud pg_autovacuum could fork another
process to connect to postmaster as an ordinary backend each time it feels
that a "vacuum" is needed.

#31Matthew T. O'Connor
matthew@zeut.net
In reply to: Josh Berkus (#28)
Re: Autovacuum in the backend

Josh Berkus wrote:

Qingqing,

add a parameter to let user pass in the configuration parameters:
* autovacuum_command = "-s 100 -S 1 ..."

um, can we have these as separate GUCs and not lumped together as a string?
i.e.:
autovacuum_frequency = 60 #seconds, 0 = disable
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_multiple = 0.5
autovacuum_analyze_threshold = 100
autovacuum_analyze_multiple = 0.4

AV should be disabled by default. It should also automatically use the global
vacuum_delay settings.

Agreed, disabled by default (at least for 8.1, perhaps a topic of
conversation for 8.2+), yes it should obey the global vacuum_delay
settings, and yes it should have its own GUCs as you suggested (all of
this was the case with the patch that I submitted for 8.0, which Alvaro
is now working on).

But it would be very nice to have something _similar_ to FSM, say DSM
(dead space map), which is filled in when a tuple is marked as "dead for
all running backends", which could be used to implement a vacuum which
vacuums only those pages, which do actually contain removable tuples.

Speaking of FSM improvements, it would be **really** useful to have a pg_stats
view that let you know how full the FSM was, overall. something like:
pg_stats_fsm_usage
fsm_relations fsm_relations_used fsm_pages fsm_pages_used
1000 312 200000 11579

This would allow for other schemes of vacuum automation.

Interesting; perhaps if FSM data is exported to the stats system,
autovacuum could use that. What might be best is both a view that
shows overall FSM information and also exporting FSM information on
a per-table basis, perhaps as additional columns added to existing stats
tables.

#32Matthew T. O'Connor
matthew@zeut.net
In reply to: Hannu Krosing (#20)
Re: Autovacuum in the backend

Hannu Krosing wrote:

Can autovacuum yet be configured _not_ to run vacuum during some hours
or above some load ?

That is certainly a goal; hopefully it will get done for 8.1. The
actual design I had in mind (based on prior discussion on hackers) is to
allow a maintenance window that would have lower vacuum thresholds; this
way only the tables that really need it will get vacuumed during the day.

Even better - to stop or pause a long-running vacuum if load goes above
some limit.

I don't think the current implementation of VACUUM can support that. I
believe that all the work will get rolled back if it gets canceled.

Perhaps a decent solution would be to have autovacuum increase the
vacuum delay settings dynamically based on the system load average. That
way, if a vacuum starts and the system starts to get busy, the autovacuum
daemon can increase the vacuum delay settings and VACUUM would honor
this while running.
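
Purely to illustrate the idea (getloadavg() is a BSD/Linux library call,
and nothing like this exists in pg_autovacuum today):

#include <stdlib.h>

/*
 * Scale the base cost delay up when the 1-minute load average exceeds a
 * configured threshold; the daemon could then issue something like
 * "SET vacuum_cost_delay = N" before (or during) each VACUUM.
 */
static int
adjusted_cost_delay(int base_delay_ms, double load_threshold)
{
    double      load;

    if (getloadavg(&load, 1) == 1 && load > load_threshold)
        return (int) (base_delay_ms * load / load_threshold);
    return base_delay_ms;
}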

If it goes into backend before the above is done, it should at least be
possible to switch it off completely.

Absolutely, in fact it will not only have the option to turn it off, it
will be off by default.

#33Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Matthew T. O'Connor (#32)
Re: Autovacuum in the backend

Matthew T. O'Connor wrote:

I don't think the current implementation of VACUUM can support that. I
believe that all the work will get rolled back if it gets canceled.

Perhaps a decent solution would be to have autovacuum increase the
vacuum delay settings dynamically based on the system load average. That
way, if a vacuum starts and the system starts to get busy, the autovacuum
daemon can increase the vacuum delay settings and VACUUM would honor
this while running.

I would like to have the GUC variables be honored while the system is
running; that would allow administrators to make changes from scripts.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#34Gavin Sherry
swm@linuxworld.com.au
In reply to: Bruce Momjian (#25)
Re: Autovacuum in the backend

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think these results can be dramatically improved if the focus is on a
more effective vacuum.

In January I was in Toronto with Jan, Tom and others and some ideas about
vacuum were being discussed. The basic idea is that when we dirty pages we
need we set a bit in a bitmap to say that the page has been dirty. A
convenient place to do this is when we are writing dirty buffers out to
disk. In many situations, this can happen inside the bgwriter meaning that
there should be little contention for this bitmap. Of course, individual
backends may be writing pages out and would have to account for the
dirty pages at that point.

Now this bitmap can occur on a per-heap-segment basis (ie, per 1 GB heap
file). You only need 2 pages for the bitmap to represent all the pages in
the segment, which is fairly nice. When vacuum is run, instead of visiting
every page, it would see which pages have been dirtied in the bitmap and
visit only those pages. With large tables and small numbers of modified
tuples/pages, the effect this change would have would be pretty
impressive.
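
To spell out the arithmetic (a sketch only; the names are invented and
this assumes the default 8 KB block size):

/*
 * A 1 GB heap segment holds (1 GB / 8 KB) = 131072 pages; at one bit per
 * page, the map needs 131072 / 8 = 16384 bytes, i.e. exactly two 8 KB pages.
 */
#define SEGMENT_PAGES   (0x40000000 / BLCKSZ)   /* 131072 with BLCKSZ = 8192 */

typedef struct PageDirtyMap
{
    uint8       bits[SEGMENT_PAGES / 8];        /* 16384 bytes */
} PageDirtyMap;

static void
mark_page_dirty(PageDirtyMap *map, BlockNumber blkno)
{
    map->bits[blkno / 8] |= (uint8) (1 << (blkno % 8));
}

static bool
page_is_dirty(const PageDirtyMap *map, BlockNumber blkno)
{
    return (map->bits[blkno / 8] & (1 << (blkno % 8))) != 0;
}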

This also means that we could effectively implement some of the ideas
which are being floated around, such as having vacuum run only for a short
time period.

One problem is whether or not we have to guarantee that we account for
every dirtied page. I think that would be difficult in the presence of a
crash. One idea Neil mentioned is that on a crash, we could set all pages
in the bitmap to dirty and the first vacuum would effectively be a vacuum
full. The alternative is to say that we don't guarantee that this type of
vacuum is completely comprehensive and that it isn't a replacement for
vacuum full.

Thoughts? Comments?

Thanks,

Gavin

#35Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Matthew T. O'Connor (#31)
Re: Autovacuum in the backend

um, can we have these as separate GUCs and not lumped together as a
string? i.e.:
autovacuum_frequency = 60 #seconds, 0 = disable
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_multiple = 0.5
autovacuum_analyze_threshold = 100
autovacuum_analyze_multiple = 0.4

AV should be disabled by default. It should also automatically use
the global vacuum_delay settings.

Agreed, disabled by default (at least for 8.1, perhaps a topic of
conversation for 8.2+), yes it should obey the global vacuum_delay
settings, and yes it should have it's own GUC's as you suggested (all of
this was the case with the patch that I submitted for 8.0, which Alvarro
is now working on).

I think it should be on by default :)

Let's not ship software in a default configuration that we KNOW will go
to hell.

How about we aim to ensure that if someone doesn't modify their
postgresql.conf, they will have no problems? Wasn't that the aim of
defaulting shared_buffers to 1000 where we can?

Chris

#36Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Gavin Sherry (#34)
Re: Autovacuum in the backend

I totally agree. I think we know pg_autovacuum is just a short-term
solution. What we need is for someone to take the lead in this.

The TODO list has the ideas documented. There is no reason the
background writer could not load the FSM directly with free pages.

---------------------------------------------------------------------------

Gavin Sherry wrote:

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think these results can be dramatically improved if the focus is on a
more effective vacuum.

In January I was in Toronto with Jan, Tom and others, and some ideas about
vacuum were being discussed. The basic idea is that when we dirty pages,
we set a bit in a bitmap to say that the page has been dirtied. A
convenient place to do this is when we are writing dirty buffers out to
disk. In many situations this can happen inside the bgwriter, meaning that
there should be little contention for this bitmap. Of course, individual
backends may be writing pages out and would have to account for the
dirty pages at that point.

Now this bitmap can occur on a per-heap-segment basis (ie, per 1 GB heap
file). You only need 2 pages for the bitmap to represent all the pages in
the segment, which is fairly nice. When vacuum is run, instead of visiting
every page, it would see which pages have been dirtied in the bitmap and
visit only those pages. With large tables and small numbers of modified
tuples/pages, the effect this change would have would be pretty
impressive.

This also means that we could effectively implement some of the ideas
which are being floated around, such as having vacuum run only for a short
time period.

One problem is whether or not we have to guarantee that we account for
every dirtied page. I think that would be difficult in the presence of a
crash. One idea Neil mentioned is that on a crash, we could set all pages
in the bitmap to dirty and the first vacuum would effectively be a vacuum
full. The alternative is to say that we don't guarantee that this type of
vacuum is completely comprehensive and that it isn't a replacement for
vacuum full.

Thoughts? Comments?

Thanks,

Gavin


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#37Alvaro Herrera
alvherre@surnet.cl
In reply to: Matthew T. O'Connor (#31)
Re: Autovacuum in the backend

A question for interested parties. I'm thinking of handling the
user/password issue by reading the flat files (the copies of pg_shadow,
pg_database, etc).

The only thing that I'd need to modify is to add the datdba field to
pg_database, so we can figure out an appropriate user for vacuuming each
database.

What do people think?

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Acepta los honores y aplausos y perder�s tu libertad"

#38Alvaro Herrera
alvherre@surnet.cl
In reply to: Gavin Sherry (#34)
Re: Autovacuum in the backend

On Thu, Jun 16, 2005 at 11:07:31AM +1000, Gavin Sherry wrote:

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think those are orthogonal issues. One is fixing whatever performance
issues there are because of VACUUM. Note that the fact that Mark was
seeing such a drop in performance with autovacuum only means that
at the enormous load under which the OSDL tests are run, autovacuum is
not the best solution. Not everybody runs with that sort of load
anyway. (In fact lots of people don't.)

So, one issue is that at high loads, there are improvements to be made
to VACUUM. The other issue is to get VACUUM to run in the first place,
which is what autovacuum addresses.

I can easily predict that we will make adjustments and improvements to
VACUUM in the future, but I'm not so sure if it will happen before 8.1
feature-freezes. I have more confidence that we can integrate
autovacuum for 8.1, which will be a leap forward.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)

#39Matthew T. O'Connor
matthew@zeut.net
In reply to: Alvaro Herrera (#37)
Re: Autovacuum in the backend

Alvaro Herrera wrote:

A question for interested parties. I'm thinking in handling the
user/password issue by reading the flat files (the copies of pg_shadow,
pg_database, etc).

The only thing that I'd need to modify is add the datdba field to
pg_database, so we can figure out an appropiate user for vacuuming each
database.

What do people think?

I probably don't understand all the issues involved here, but reading
pg_shadow by hand seems problematic. Do you constantly re-read it?
What happens when a new user is added, etc.?

Can't autovacuum run as a super-user that can vacuum anything?

#40Russell Smith
mr-russ@pws.com.au
In reply to: Alvaro Herrera (#38)
Re: Autovacuum in the backend

On Thu, 16 Jun 2005 12:54 pm, Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 11:07:31AM +1000, Gavin Sherry wrote:

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think those are orthogonal issues. One is fixing whatever performance
issues there are because of VACUUM. Note that the fact that Mark was
seeing such a drop in performance with autovacuum only means that
at the enormous load under which the OSDL tests are run, autovacuum is
not the best solution. Not everybody runs with that sort of load
anyway. (In fact lots of people don't.)

I agree here. There have been a couple of patches for improvements to VACUUM rejected in the past,
e.g. partial vacuum. If we have autovacuum in the backend, it doesn't matter about people
vacuuming the wrong part of a file. The system will manage it.

I'd also say there are a much greater number of people who will be able to fiddle with
an integrated autovac to improve its performance and load handling. However, I think there
are fewer people who can complete what Alvaro is doing.

So, one issue is that at high loads, there are improvements to be made
to VACUUM. The other issue is to get VACUUM to run in the first place,
which is what autovacuum addresses.

There are plenty of ideas to shoot around here. Like:
- only run one iteration of a vacuum so you only clean indexes once, then stop the vacuum till the next cycle.
- create the dead space map stuff with the bgwriter.
- make sure you have individual per-table analyze and vacuum stats so vacuum can be flexible with different tables.

Some of the autovac issues we have seen recently, like the O(n^2) behavior with many tables, will go away by being in the backend.
So not everything will perform the same after the integration.

I can easily predict that we will make adjustments and improvements to
VACUUM in the future, but I'm not so sure if it will happen before 8.1
feature-freezes. I have more confidence that we can integrate
autovacuum for 8.1, which will be a leap forward.

The big leap will be to get it in the backend, which will, as Chris KL suggested, stop people shooting themselves in the foot.
I'm not sure what strict rules are imposed by feature freeze, but there may be time for others to make some improvements before 8.1.
We have also looked at this for at least 2 releases now. If it doesn't get in now, it will just get in for 8.2, with no improvements till 8.2.

Regards

Russell Smith

#41Josh Berkus
josh@agliodbs.com
In reply to: Gavin Sherry (#34)
Re: Autovacuum in the backend

Gavin, People,

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

It wasn't quite that bad, and the automated DBT2 is deceptive; the test
doesn't run for long enough for *not* vacuuming to be a problem. For a real
test, you'd need to do a 24-hour or 48-hour DBT2 run.

Not that I don't agree that we need a less I/O-intense alternative to VACUUM,
but it seems unlikely that we could actually do this, or even agree on a
spec, before feature freeze. Whereas integrated AV is something we *could*
do, and is widely desired.

If we do integrated AV, it should only be turned on by default at a relatively
low level. And wasn't there an issue on Windows with AV not working?

--
Josh Berkus
Aglio Database Solutions
San Francisco

#42Alvaro Herrera
alvherre@surnet.cl
In reply to: Matthew T. O'Connor (#39)
Re: Autovacuum in the backend

On Wed, Jun 15, 2005 at 11:42:17PM -0400, Matthew T. O'Connor wrote:

Alvaro Herrera wrote:

A question for interested parties. I'm thinking in handling the
user/password issue by reading the flat files (the copies of pg_shadow,
pg_database, etc).

The only thing that I'd need to modify is add the datdba field to
pg_database, so we can figure out an appropiate user for vacuuming each
database.

I probably don't understand all the issue involved here but reading
pg_shadow by hand seems problematic. Do you constantly re-read it?
What happens when a new user is added etc....

You don't read the pg_shadow table. Rather, you read the pg_user file,
which is a plain-text file representing the information in pg_shadow.
It's kept up to date by backends that modify user information. Likewise
for pg_database and pg_group.

Can't autovacuum run as a super-user that can vacuum anything?

That'd be another way to do it, maybe simpler.

Currently I'm working on separating this into two parts, though: one being
a shlib and the other the standard postmaster-launched backend process. So
I don't have to address this issue right now. It just bothered me to
need a separate file with a username and password, and the corresponding
code to read it.

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running. The current approach is to have one
autovacuum process. Two possible options would be to have one per
database, and one per tablespace. What do people think?

I'm leaning toward the simpler option myself, but I'd like to hear more
opinions. Particularly since one-per-database makes the code a lot
simpler as far as I can see, because the shlib only needs to worry about
issuing VACUUM commands; with the other approaches, the shlib has to
manage everything (keeping the pg_autovacuum table up to date, figuring out
when vacuums are needed, etc.)

The main problem with the one-per-database is that we wouldn't have a
(simple) way of coordinating vacuums so that they don't compete for I/O.
That's why I thought of the one-per-tablespace approach, though that one
is the most complex of all.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Un poeta es un mundo encerrado en un hombre" (Victor Hugo)

#43Gavin Sherry
swm@linuxworld.com.au
In reply to: Alvaro Herrera (#38)
Re: Autovacuum in the backend

On Wed, 15 Jun 2005, Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 11:07:31AM +1000, Gavin Sherry wrote:

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think those are orthogonal issues. One is fixing whatever performance
issues there are because of VACUUM. Note that the fact that Mark was
seeing such a drop in performance with autovacuum only means that
at the enormous load under which the OSDL tests are run, autovacuum is
not the best solution. Not everybody runs with that sort of load
anyway. (In fact lots of people don't.)

I agree.

So, one issue is that at high loads, there are improvements to be made
to VACUUM. The other issue is to get VACUUM to run in the first place,
which is what autovacuum addresses.

I can easily predict that we will make adjustments and improvements to
VACUUM in the future, but I'm not so sure if it will happen before 8.1
feature-freezes. I have more confidence that we can integrate
autovacuum for 8.1, which will be a leap forward.

I guess my main concern is that we'll have a solution to the problem of
dead tuples which is only halfway there. It is only an incremental
improvement upon the contrib module and solves only one real problem:
users do not read up on VACUUM or autovacuum. This is at the expense of
making it appear to be suitable for the general user base when it isn't,
in my opinion. That isn't the fault of autovacuum but is a function of the
cost of ordinary vacuum.

Thanks,

Gavin

#44Joshua D. Drake
jd@commandprompt.com
In reply to: Russell Smith (#40)
Re: Autovacuum in the backend

The big leap will be to get it in the backend, which will, as Chris KL suggested, stop people shooting themselves in the foot.
I'm not sure what strict rules are imposed by feature freeze, but there may be time for others to make some improvements before 8.1.
We have also looked at this for at least 2 releases now. If it doesn't get in now, it will just get in for 8.2, with no improvements till 8.2.

Just my own two cents. First, I am not knocking the work that has been done
on autovacuum. I am sure that it was a leap on its own to get it to work.
However, I will say that I just don't see the reason for it.

Vacuum, especially in the 8.x series, isn't that bad. Heck, if you actually
manage your catalog, even on large databases it can be reasonable. Yes,
it takes a little **gasp** administrative maintenance to run vacuum at
just the right time, on just the right tables, but...

Anyway -- it seems it may be beneficial to focus the efforts somewhere
else. The only reason I wanted to know if it was going to be in the
backend last week was because I needed to know if I was going to have
to document in the new book.

Sincerely,

Joshua D. Drake


Regards

Russell Smith


#45Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#42)
Re: Autovacuum in the backend

Alvaro,

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running.  The current approach is to have one
autovacuum process.  Two possible options would be to have one per
database, and one per tablespace.  What do people think?

I'd vote for one, period, for the cluster, if you can manage that. Let's
stick to simple for now. Most users have their database on a single disk or
array, so multiple concurrent vacuums will compete for I/O regardless of
different databases.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#46Josh Berkus
josh@agliodbs.com
In reply to: Joshua D. Drake (#44)
Re: Autovacuum in the backend

Josh,

Just my own two cents. First, I am not knocking the work that has been done
on autovacuum. I am sure that it was a leap on its own to get it to work.
However, I will say that I just don't see the reason for it.

I've personally seen at least a dozen user requests for "autovacuum in the
backend", and had this conversation about 1,100 times:

NB: "After a week, my database got really slow."
Me: "How often are you running VACUUM ANALYZE?"
NB: "Running what?"

--
Josh Berkus
Aglio Database Solutions
San Francisco

#47Neil Conway
neilc@samurai.com
In reply to: Josh Berkus (#41)
Re: Autovacuum in the backend

Josh Berkus wrote:

Not that I don't agree that we need a less I/O intense alternative to VACUUM,
but it seems unlikely that we could actually do this, or even agree on a
spec, before feature freeze.

I don't see the need to rush anything in before the feature freeze.

Whereas integrated AV is something we *could* do, and is widely desired.

I don't see why. IMHO the current autovacuum approach is far from
optimal. If "integrated autovacuum" just means taking the same approach
and building it into the backend, how does that significantly improve
matters? (I find it difficult to take seriously answers like "it lets us
use the backend's hash table implementation"). It _does_ mean there is
more of an implicit stamp of PGDG approval for pg_autovacuum, which is
something I personally wouldn't want to give to the current design.

-Neil

#48Neil Conway
neilc@samurai.com
In reply to: Alvaro Herrera (#42)
Re: Autovacuum in the backend

Alvaro Herrera wrote:

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running. The current approach is to have one
autovacuum process. Two possible options would be to have one per
database, and one per tablespace. What do people think?

Why do we need more than one pg_autovacuum process? (Note that this need
not necessarily imply only one concurrent VACUUM, as you can use
non-blocking connections in libpq.)
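
For instance, a single process could drive several VACUUMs at once along
these lines (a minimal sketch: connection strings are made up, connects
are left blocking for brevity, and error handling is omitted):

#include <libpq-fe.h>

void vacuum_two_databases(void)
{
    PGconn   *conn[2];
    int       pending = 2, i;

    /* blocking connects for brevity; PQconnectStart() would avoid even that */
    conn[0] = PQconnectdb("dbname=db1");
    conn[1] = PQconnectdb("dbname=db2");

    /* PQsendQuery dispatches the command and returns immediately */
    for (i = 0; i < 2; i++)
        PQsendQuery(conn[i], "VACUUM ANALYZE");

    while (pending > 0)
    {
        for (i = 0; i < 2; i++)
        {
            if (conn[i] == NULL)
                continue;
            PQconsumeInput(conn[i]);    /* absorb any arriving data */
            if (!PQisBusy(conn[i]))     /* this VACUUM has finished */
            {
                PGresult *res;

                while ((res = PQgetResult(conn[i])) != NULL)
                    PQclear(res);
                PQfinish(conn[i]);
                conn[i] = NULL;
                pending--;
            }
        }
        /* a real loop would select() on PQsocket() instead of spinning */
    }
}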

-Neil

#49Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Josh Berkus (#46)
Re: Autovacuum in the backend

I've personally seen at least a dozen user requests for "autovacuum in the
backend", and had this conversation about 1,100 times:

NB: "After a week, my database got really slow."
Me: "How often are you running VACUUM ANALYZE?"
NB: "Running what?"

Me too. Just hang out in #postgresql for a while :)

#50Alvaro Herrera
alvherre@surnet.cl
In reply to: Neil Conway (#48)
Re: Autovacuum in the backend

On Thu, Jun 16, 2005 at 02:09:47PM +1000, Neil Conway wrote:

Alvaro Herrera wrote:

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running. The current approach is to have one
autovacuum process. Two possible options would be to have one per
database, and one per tablespace. What do people think?

Why do we need more than one pg_autovacuum process?

The only reason I considered it is because you can use the regular
catalog-management routines to handle the new pg_autovacuum system
catalog. With a single process, we need to issue SQL queries. This is
very ugly IMHO.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"He who admits that he is a coward has courage" (Fernandel)

#51Matthew T. O'Connor
matthew@zeut.net
In reply to: Joshua D. Drake (#44)
Re: Autovacuum in the backend

Joshua D. Drake wrote:

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap in its own right to get
it to work. However, I will say that I just don't see the reason for it.

The major reasons for autovacuum as I see it are as follows:

* Reduces administrative overhead: no more keeping track of which tables
need to be vacuumed and how often.
* Reduces the total amount of time the system spends vacuuming, since it
only vacuums when needed.
* Keeps stats up-to-date automatically.
* Eliminates newbie confusion.
* Eliminates one of the criticisms that the public has against
PostgreSQL (justified or not).

Also, as VACUUM improves, autovacuum will improve with it.

#52Matthew T. O'Connor
matthew@zeut.net
In reply to: Gavin Sherry (#43)
Re: Autovacuum in the backend

Gavin Sherry wrote:

I guess my main concern is that we'll have a solution to the problem of
dead tuples which is only halfway there. It is only an incremental
improvement upon the contrib module and solves only one real problem:
users do not read up on VACUUM or autovacuum. This is at the expense of
making it appear to be suitable for the general user base when it isn't,
in my opinion. That isn't the fault of autovacuum but is a function of the
cost of ordinary vacuum.

Would you mind expounding on why you think autovacuum isn't suitable for
the general public? I know it's not a silver bullet, but I think in
general, it will be helpful for most people.

#53Matthew T. O'Connor
matthew@zeut.net
In reply to: Neil Conway (#47)
Re: Autovacuum in the backend

Neil Conway wrote:

Josh Berkus wrote:

Whereas integrated AV is something we *could* do, and is widely

desired.

I don't see why. IMHO the current autovacuum approach is far from
optimal. If "integrated autovacuum" just means taking the same
approach and building it into the backend, how does that significantly
improve matters? (I find it difficult to take seriously answers like
"it lets us use the backend's hash table implementation"). It _does_
mean there is more of an implicit stamp of PGDG approval for
pg_autovacuum, which is something I personally wouldn't want to give
to the current design.

The reason to integrate it has nothing to do with the hash
implementation; it has to do with making autovacuum more accessible to
the masses and, more importantly, it provides a solution (not
necessarily the best solution) to the vacuum problem, which I believe is
a problem for PostgreSQL. Integrating it into the backend also allows
autovacuum to be better than it is now: using the backend logging
functions, storing per-table thresholds, solving the O(n^2) problem,
fixing start-up and shutdown issues, and more. I agree that if
autovacuum becomes a long-term solution then we should also integrate
FSM information etc...

What else is lacking in the current design? Or more specifically what
else would have to be done before you would consider giving it the PGDG
stamp of approval?

Matthew

#54Matthew T. O'Connor
matthew@zeut.net
In reply to: Alvaro Herrera (#50)
Re: Autovacuum in the backend

Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 02:09:47PM +1000, Neil Conway wrote:

Alvaro Herrera wrote:

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running. The current approach is to have one
autovacuum process. Two possible options would be to have one per
database, and one per tablespace. What do people think?

Why do we need more than one pg_autovacuum process?

The only reason I considered it is because you can use the regular
catalog-management routines to handle the new pg_autovacuum system
catalog. With a single process, we need to issue SQL queries. This is
very ugly IMHO.

It was always my intention to have VACUUM and ANALYZE update the new
autovacuum system table; I just never got around to making that happen.
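
Roughly, the shape I have in mind for that catalog is something like the
following (in the usual FormData style; the field names are illustrative
only, not a final definition):

typedef struct FormData_pg_autovacuum
{
    Oid     vacrelid;          /* pg_class OID of the table */
    bool    enabled;           /* false = never touch this table */
    int4    vac_base_thresh;   /* min dead tuples before a VACUUM */
    float4  vac_scale_factor;  /* plus this fraction of reltuples */
    int4    anl_base_thresh;   /* likewise for ANALYZE */
    float4  anl_scale_factor;
} FormData_pg_autovacuum;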

Personally I would vote for simplicity for now, that is, only one
autovacuum process, allowed to issue only one VACUUM command at any
given time. Something more complicated sounds to me like a
second-generation optimisation.

#55Matthew T. O'Connor
matthew@zeut.net
In reply to: Neil Conway (#48)
Re: Autovacuum in the backend

Neil Conway wrote:

Alvaro Herrera wrote:

One issue I do have to deal with right now is how many autovacuum
processes do we want to be running. The current approach is to have one
autovacuum process. Two possible options would be to have one per
database, and one per tablespace. What do people think?

Why do we need more than one pg_autovacuum process? (Note that this
need not necessarily imply only one concurrent VACUUM, as you can use
non-blocking connections in libpq.)

Part of the backend integration work Alvaro is doing is teaching
autovacuum to do its work without libpq.

#56Rod Taylor
pg@rbt.ca
In reply to: Matthew T. O'Connor (#51)
Re: Autovacuum in the backend

On Thu, 2005-06-16 at 00:44 -0400, Matthew T. O'Connor wrote:

Joshua D. Drake wrote:

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap in its own right to get
it to work. However, I will say that I just don't see the reason for it.

* Eliminates newbie confusion

Ignore everything else. This one is the clincher.

Someone doing serious database work is going to read the docs to find
out about backup / restore processes and basic tuning. They'll run
across the disable switch for autovacuum soon enough.

The jack-of-all-trades IT guy who is running something minor but doesn't
know much about databases in general won't have as many hurdles to
climb.

Besides, vacuum off by default potentially makes for huge files that take
forever to reclaim space from (cluster, vacuum full, etc.). Vacuum on by
default means that, in the worst case, they turn it off and instantly
their IO load decreases.
--

#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#37)
Re: Autovacuum in the backend

Alvaro Herrera <alvherre@surnet.cl> writes:

A question for interested parties. I'm thinking in handling the
user/password issue by reading the flat files (the copies of pg_shadow,
pg_database, etc).

Er, what "user/password issue"? Context please.

The only thing that I'd need to modify is to add the datdba field to
pg_database, so we can figure out an appropriate user for vacuuming each
database.

The datdba is not necessarily a superuser, and therefore is absolutely
not the right answer for any question related to autovacuum.  But in
any case, I would expect that an integrated-into-the-backend autovac
implementation would be operating at a level below any permission checks
--- so this question shouldn't be relevant anyway.

regards, tom lane

#58Gavin Sherry
swm@linuxworld.com.au
In reply to: Matthew T. O'Connor (#52)
Re: Autovacuum in the backend

On Thu, 16 Jun 2005, Matthew T. O'Connor wrote:

Gavin Sherry wrote:

I guess my main concern is that we'll have a solution to the problem of
dead tuples which is only halfway there. It is only an incremental
improvement upon the contrib module and solves only one real problem:
users do not read up on VACUUM or autovacuum. This is at the expense of
making it appear to be suitable for the general user base when it isn't,
in my opinion. That isn't the fault of autovacuum but is a function of the
cost of ordinary vacuum.

Would you mind expounding on why you think autovacuum isn't suitable for
the general public? I know it's not a silver bullet, but I think in
general, it will be helpful for most people.

As I said, this is largely the fault of VACUUM. The main thing I'd like to
see is a complete solution to the problem. I'm not picking on autovacuum.
However, I will elaborate a little on why I think autovacuum should not
be a feature of the backend:

1) The main argument so far is that autovacuum will ensure that users who
do not read the maintenance section of the manual will not notice a
deterioration of performance. This means that we anticipate autovacuum
being on by default. This suggests that the default autovacuum
configuration will not need tuning. I do not think that will be the case.

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

3) autovacuum on by default means row level stats are on by default. This
will have a non-trivial performance impact on users, IMHO. For right or
wrong, our users take the postgresql.conf defaults pretty seriously and
this level of stats collection could and will remain enabled in some
non-trivial percentage of users who turn autovacuum off (consider many
users' reluctance to change shared_buffers in previous releases). To quote
from the README:

"The overhead of the stats system has been shown to be significant under
certain workloads. For instance, a tight loop of queries performing
"select 1" was found to run nearly 30% slower when row-level stats were
enabled."

I'm not one for "select 1" benchmarks but this is a problem that hasn't
even been mentioned, as far as I recall.

4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.

5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there are bound
to be bloating issues (see 4).

I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is
a good contrib style solution: it addresses a problem users have and
attempts to solve it the way other users might try and solve it. When you
consider it in the backend, it looks like a workaround. I think users are
better served by solving the real problem.

Gavin

#59Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Gavin Sherry (#58)
Re: Autovacuum in the backend

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

I completely agree with Gavin - integrating this kind of thing into the
background writer or integrating it with the FSM would be the ideal
solution.

I guess everybody who has ever vacuumed a 2 TB relation will agree
here. VACUUM is not a problem for small "my cat Minka" databases.
However, it has been a real problem on large, heavily loaded databases. I
have even seen people splitting large tables and joining them with a view
to avoid long vacuums and long CREATE INDEX operations (I am not joking
- this is serious).

PostgreSQL is more and more used on really large boxes, so this is an
increasing problem. Gavin's approach using a vacuum bitmap seems to be a
good one. An alternative would be to have some sort of vacuum queue
containing a set of pages which are reported by the writing process (=
background writer or backends).

best regards,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at

#60Magnus Hagander
mha@sollentuna.net
In reply to: Hans-Jürgen Schönig (#59)
Re: Autovacuum in the backend

If we do integrated AV, it should only be turned on by
default at a relatively low level. And wasn't there an issue
on Windows with AV not working?

AFAIK, it works.
But the fact that you need to provide it with a userid/password combo
makes it a lot harder to install as a service. And it's not installed by
default by the installer, for that reason (and I think maybe others?
Dave, I think you're the one who said no-service-by-default?)

//Magnus

#61Gavin Sherry
swm@linuxworld.com.au
In reply to: Hans-Jürgen Schönig (#59)
Re: Autovacuum in the backend

On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

I completely agree with Gavin - integrating this kind of thing into the
background writer or integrating it with the FSM would be the ideal
solution.

I guess everybody who has ever vacuumed a 2 TB relation will agree
here. VACUUM is not a problem for small "my cat Minka" databases.
However, it has been a real problem on large, heavily loaded databases. I
have even seen people splitting large tables and joining them with a view
to avoid long vacuums and long CREATE INDEX operations (I am not joking
- this is serious).

I think this gets away from my point a little. People with 2 TB tables can
take care of themselves, as can people who've taken the time to partition
their tables to speed up vacuum. I'm more concerned about the majority of
people who fall in the middle -- between the hobbyist and the high-end
data centre.

Thanks,

Gavin

#62Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Gavin Sherry (#61)
Re: Autovacuum in the backend

Gavin Sherry wrote:

On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

I completely agree with Gavin - integrating this kind of thing into the
background writer or integrating it with the FSM would be the ideal
solution.

I guess everybody who has ever vacuumed a 2 TB relation will agree
here. VACUUM is not a problem for small "my cat Minka" databases.
However, it has been a real problem on large, heavily loaded databases. I
have even seen people splitting large tables and joining them with a view
to avoid long vacuums and long CREATE INDEX operations (I am not joking
- this is serious).

I think this gets away from my point a little. People with 2 TB tables can
take care of themselves, as can people who've taken the time to partition
their tables to speed up vacuum. I'm more concerned about the majority of
people who fall in the middle -- between the hobbyist and the high-end
data centre.

Thanks,

Gavin

I think your approach will help all of them.
If we had some sort of autovacuum (which is packaged with most distros
anyway - having it in the core is nice as well) and a mechanism to
improve reallocation / vacuum speed, we would have solved all problems.

I do think that 2 TB sites can take care of themselves. The question is,
however, whether the database can do what they want ...

thanks a lot,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at

#63Dave Page
dpage@vale-housing.co.uk
In reply to: Hans-Jürgen Schönig (#62)
Re: Autovacuum in the backend

-----Original Message-----
From: Magnus Hagander [mailto:mha@sollentuna.net]
Sent: 16 June 2005 10:15
To: Josh Berkus; Gavin Sherry
Cc: Bruce Momjian; pgsql-hackers@postgresql.org; Dave Page
Subject: RE: [HACKERS] Autovacuum in the backend

If we do integrated AV, it should only be turned on by
default at a relatively low level. And wasn't there an issue
on Windows with AV not working?

AFAIK, it works.
But the fact that you need to provide it with a userid/password combo
makes it a lot harder to install as a service. And it's not
installed by
default by the installer, for that reason (and I think maybe others?
Dave, I think you're the one who said no-service-by-default?)

Yes, 'cos there was no easy way to do it securely when I did it, without
a lot of mucking about to set up a .pgpass file in the service user's
account.
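
(For reference, a .pgpass entry is just one
host:port:database:username:password line per server, e.g. with made-up
values:

localhost:5432:*:autovacuum:secret

and libpq will complain unless the file is mode 0600 - hence the mucking
about under a service account.)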

It does work perfectly well however, and did so long before PostgreSQL
itself did :-)

Regards, Dave

#64Andrew Dunstan
andrew@dunslane.net
In reply to: Gavin Sherry (#61)
Re: Autovacuum in the backend

Gavin Sherry said:

On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:

2) By no fault of its own, autovacuum's level of granularity is the
table level. For people dealing with non-trivial amounts of data
(and we're not talking gigabytes or terabytes here), this is a
serious drawback. Vacuum at peak times can cause very intense IO
bursts -- even with the enhancements in 8.0. I don't think the
solution to the problem is to give users the impression that it is
solved and then vacuum their tables during peak periods. I cannot
stress this enough.

I completely agree with Gavin - integrating this kind of thing into the
background writer or integrating it with the FSM would be the ideal
solution.

I guess everybody who has ever vacuumed a 2 TB relation will agree
here. VACUUM is not a problem for small "my cat Minka" databases.
However, it has been a real problem on large, heavily loaded databases. I
have even seen people splitting large tables and joining them with a view
to avoid long vacuums and long CREATE INDEX operations (I am not joking
- this is serious).

I think this gets away from my point a little. People with 2 TB tables
can take care of themselves, as can people who've taken the time to
partition their tables to speed up vacuum. I'm more concerned about the
majority of people who fall in the middle -- between the hobbyist and
the high end data centre.

My only problem with what you say is the suggestion that we should not
incorporate AV into the backend until these things have been solved. This
would be one step down a long road, and that's how it should be positioned.

I am very concerned that with Feature Freeze 2 weeks away we seem to be in a
similar position to where we were a year ago. I know we don't even promise
anything, but certainly I and others believed that work was being done to
get AV into the backend in 8.1. Not doing this because we think it could be
lots better would not give people a good impression of our processes. I
certainly don't think it will make matters worse, especially if it's not on
by default.

cheers

andrew

#65Christopher Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#25)
Re: Autovacuum in the backend

swm@linuxworld.com.au (Gavin Sherry) wrote:

I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is
a good contrib style solution: it addresses a problem users have and
attempts to solve it the way other users might try and solve it. When you
consider it in the backend, it looks like a workaround. I think users are
better served by solving the real problem.

Hear, hear!

It seems to me that the point in time at which it is *really*
appropriate to put this into the backend is when the new GUC variable
"dead_tuple_map_size" (akin to FSM) is introduced, and there is a new
sort of 'VACUUM DEAD TUPLES' command which goes through the DTPM (Dead
Tuple Page Map).

In THAT case, there would be the ability to do a VACUUM on the "dead
bits" of the table that consists of 50M rows without having to go
through the 49M rows that haven't been touched in months.
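
In caricature, the map could be as simple as one bit per heap page (all
names here are hypothetical, of course):

typedef struct DeadTuplePageMap
{
    unsigned char *bits;       /* one bit per heap page */
    unsigned long  npages;
} DeadTuplePageMap;

/* called whenever a tuple on 'pageno' becomes dead */
static void
dtpm_mark(DeadTuplePageMap *map, unsigned long pageno)
{
    map->bits[pageno / 8] |= (unsigned char) (1 << (pageno % 8));
}

/* 'VACUUM DEAD TUPLES' would visit only the pages where this is true */
static int
dtpm_is_marked(const DeadTuplePageMap *map, unsigned long pageno)
{
    return (map->bits[pageno / 8] >> (pageno % 8)) & 1;
}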
--
"cbbrowne","@","gmail.com"
http://linuxfinances.info/info/languages.html
"I can't escape the sensation that I have already been thinking in
Lisp all my programming career, but forcing the ideas into the
constraints of bad languages, which explode those ideas into a
bewildering array of details, most of which are workarounds for the
language." -- Kaz Kylheku

#66Matthew T. O'Connor
matthew@zeut.net
In reply to: Andrew Dunstan (#64)
Re: Autovacuum in the backend

Andrew Dunstan wrote:

Gavin Sherry said:

I think this gets away from my point a little. People with 2 TB tables
can take care of themselves, as can people who've taken the time to
partition their tables to speed up vacuum. I'm more concerned about the
majority of people who fall in the middle -- between the hobbyist and
the high end data centre.

My only problem with what you say is the suggestion that we should not
incorporate AV into the backend until these things have been solved. This
would be one step down a long road, and that's how it should be positioned.

Right, I think if VACUUM is improved then the semantics of AV in the
backend might change, but I think there will always be a need for some
maintenance, and a daemon that monitors the maintenance needs of your
database and fires off appropriate maintenance commands for you is
good. No, it doesn't solve all problems, but I think it solves a lot of
problems for a lot of people. Besides, VACUUM isn't the only thing AV
does: it also does ANALYZE to keep your stats up-to-date, and it watches
for XID wraparound. It could also look for REINDEX opportunities and
who knows what else in the future.

I am very concerned that with Feature Freeze 2 weeks away we seem to be in a
similar position to where we were a year ago. I know we don't even promise
anything, but certainly I and others believed that work was being done to
get AV into the backend in 8.1. Not doing this because we think it could be
lots better would not give people a good impression of our processes. I
certainly don't think it will make matters worse, especially if it's not on
by default.

I agree. Also, some people in this thread have been making noises about
wanting AV on by default. This might be nice, but I am still leaning
towards off by default at least in 8.1.

#67Matthew T. O'Connor
matthew@zeut.net
In reply to: Gavin Sherry (#58)
Re: Autovacuum in the backend

Gavin Sherry wrote:

On Thu, 16 Jun 2005, Matthew T. O'Connor wrote:

Would you mind expounding on why you think autovacuum isn't suitable for
the general public? I know it's not a silver bullet, but I think in
general, it will be helpful for most people.

As I said, this is largely the fault of VACUUM. The main thing I'd like to
see is a complete solution to the problem. I'm not picking on autovacuum.
However, I will elaborate a little on why I think autovacuum should not
be a feature of the backend:

Don't worry, I don't think you are picking on AV.

1) The main argument so far is that autovacuum will ensure that users who
do not read the maintenance section of the manual will not notice a
deterioration of performance. This means that we anticipate autovacuum
being on by default. This suggests that the default autovacuum
configuration will not need tuning. I do not think that will be the case.

I disagree with this. I think the newbie-protection benefits of AV are
not its primary goal, though I do think it's an important one. The
main thing AV brings is the ability to control bloating in your database
and keep your stats up-to-date no matter what your workload. It is
possible for an Admin to set up cron scripts to run VACUUM or ANALYZE on
particularly needy tables at appropriate intervals, but I guarantee that
the cron script is going to fire either too many or too few VACUUMs.
Also, when the workload changes or a new table is added, the Admin then
needs to update his cron scripts. This all goes away with AV, and I
believe this is a much bigger goal than the newbie problem.

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

I agree this is a major problem with VACUUM, but I also think it's a
different problem. One advantage of integrated AV is that you will be
able to set per-table thresholds, which include the ability to turn off
AV for any given table. If you are running a database with tables this
big, I think you will be able to figure out how to customize integrated
AV to your needs.

3) autovacuum on by default means row level stats are on by default. This
will have a non-trivial performance impact on users, IMHO. For right or
wrong, our users take the postgresql.conf defaults pretty seriously and
this level of stats collection could and will remain enabled in some
non-trivial percentage of users who turn autovacuum off (consider many
users' reluctance to change shared_buffers in previous releases). To quote
from the README:

"The overhead of the stats system has been shown to be significant under
certain workloads. For instance, a tight loop of queries performing
"select 1" was found to run nearly 30% slower when row-level stats were
enabled."

I'm not one for "select 1" benchmarks but this is a problem that hasn't
even been mentioned, as far as I recall.

I mentioned this in the README because I thought I should, not because I
think it's a real problem in practice. I think a real production
database doing queries any more complicated than "select 1" will
probably not notice the difference.
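
(For anyone who wants to measure the overhead themselves, row-level
collection comes down to these postgresql.conf switches; the values shown
are just for illustration:

stats_start_collector = true    # master switch for the stats collector
stats_row_level = true          # the per-table counts autovacuum needs
stats_block_level = false       # block-level I/O stats, not required

Turning stats_row_level back off removes the cost along with the counts.)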

4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.

Agreed, it doesn't solve all problems, and I'm not arguing that the
integration of AV makes PostgreSQL newbie-safe; it just helps reduce the
newbie problem. Again, if the default FSM settings are inappropriate for
a database, then the user is probably running something more complicated
than a "my cat Minka" database and will need to learn some tuning skills
anyway.

5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there's bound
to be bloating issues (see 4).

Not totally true: regular VACUUM can shrink tables a little (I think it
can only cut off free space at the end of the table, since it doesn't
move data around). But if AV is on and the settings are reasonable, then
a table shouldn't bloat much or at all. Also, I don't think we are
telling people to VACUUM less; in fact, tables that need it will usually
get VACUUMed more often. We are just telling users that if they turn AV
on, they don't have to manage all the VACUUMing.

I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is
a good contrib style solution: it addresses a problem users have and
attempts to solve it the way other users might try and solve it. When you
consider it in the backend, it looks like a workaround. I think users are
better served by solving the real problem.

Which problem goes away? The problem of users forgetting to VACUUM does
go away; the problem of the VACUUM command itself being problematic on
large tables doesn't, but that is a different question.

My basic position is that with integrated AV, there will always (or at
least for the foreseeable future) be some maintenance that users will
need to do to their databases by hand (or by cron) and that AV does this
better than cron does. When VACUUM is improved, the semantics of AV
might change, but the maintenance work will still need to be done.

Matt

#68Matthew T. O'Connor
matthew@zeut.net
In reply to: Hans-Jürgen Schönig (#59)
Re: Autovacuum in the backend

Hans-Jürgen Schönig wrote:

I completely agree with Gavin - integrating this kind of thing into the
background writer or integrating it with the FSM would be the ideal
solution.

Yes, AV should look at FSM data, and it will eventually. I'm not sure
how you would integrate AV with the background writer, but again, if
improvements are made to vacuum, AV might have to change along with them;
I still think it will be needed, or at least helpful.

#69Steve Atkins
steve@blighty.com
In reply to: Josh Berkus (#46)
Re: Autovacuum in the backend

On Wed, Jun 15, 2005 at 09:07:30PM -0700, Josh Berkus wrote:

Josh,

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap in its own right to get
it to work. However, I will say that I just don't see the reason for it.

I've personally seen at least a dozen user requests for "autovacuum in the
backend", and had this conversation about 1,100 times:

NB: "After a week, my database got really slow."
Me: "How often are you running VACUUM ANALYZE?"
NB: "Running what?"

Yes, me too.

I always understood autovacuum to be a way to avoid having newbies get
burned by not vacuuming, and to simplify the maintenance of
lower-traffic databases.

I don't see people with high-traffic databases (relative to the hardware
they're running on) ever using autovacuum with the current state of
vacuum and autovacuum.

If improvements to vacuum (unrelated to autovacuum) reduce the IO load
that would be a great thing, especially for those of us dealing with
24x7 databases. (I really like the dirty bitmap suggestion - it sounds
like a clean way to reduce the amount of work needed.) If autovacuum were
extended to allow more flexible scheduling (or even to be aware of the
other IO going on) then it would be of wider use - but I think the real
value of autovacuum is to make sure that new users (Windows...) don't
have a bad experience when they first try PG.

Cheers,
Steve

#70Andreas Pflug
pgadmin@pse-consulting.de
In reply to: Matthew T. O'Connor (#66)
Re: Autovacuum in the backend

Matthew T. O'Connor wrote:

Right, I think if VACUUM is improved then the semantics of AV in the
backend might change, but I think there will always be a need for some
maintenance, and a daemon that monitors the maintenance needs of your
database and fires off appropriate maintenance commands for you is
good. No, it doesn't solve all problems, but I think it solves a lot of
problems for a lot of people. Besides, VACUUM isn't the only thing AV
does: it also does ANALYZE to keep your stats up-to-date, and it watches
for XID wraparound. It could also look for REINDEX opportunities and
who knows what else in the future.

Dave,

I wonder if we should aim to have pgAgent in the backend, which was one
of the reasons I considered converting it from C++ to pure C.

There are many regular maintenance issues that AV can cover, some more
it could cover, and many more we can't even think of right now. Having
an SQL-executing agent freely at hand (consistent on _every_ platform,
without cron/anacron/at/younameit dependencies) should be helpful for
that.

Regards,
Andreas

#71Alvaro Herrera
alvherre@surnet.cl
In reply to: Tom Lane (#57)
Re: Autovacuum in the backend

On Thu, Jun 16, 2005 at 01:32:16AM -0400, Tom Lane wrote:

Alvaro Herrera <alvherre@surnet.cl> writes:

A question for interested parties. I'm thinking in handling the
user/password issue by reading the flat files (the copies of pg_shadow,
pg_database, etc).

Er, what "user/password issue"? Context please.

The only thing that I'd need to modify is to add the datdba field to
pg_database, so we can figure out an appropriate user for vacuuming each
database.

The datdba is not necessarily a superuser, and therefore is absolutely
not the right answer for any question related to autovacuum.  But in
any case, I would expect that an integrated-into-the-backend autovac
implementation would be operating at a level below any permission checks
--- so this question shouldn't be relevant anyway.

OK, it seems things are quite a bit out of context. What I did was take
Matthew's patch for integrating contrib pg_autovacuum into the
postmaster. This patch was posted several times around July and August
2004. It had several issues: an incorrect shutdown sequence, forcing
libpq to be statically linked into the backend, not correctly using
ereport(), and not using the backend's memory management infrastructure.

There were several suggestions. One was to separate it into two parts:
one would be a process launched by the postmaster, and the other a
shared library, loaded by that process, which would in turn load libpq
and issue SQL queries (including but not limited to VACUUM and ANALYZE)
to a regular backend, using a regular connection.

Now, the user/password issue is which user and password combination is
used to connect to the regular backend. Matthew had created a password
file, used in a similar fashion to libpq's password file. This works,
but has the drawback that the user has to set the file up correctly.
What I'm proposing is using the flat files for this.

Now, I'm hearing that people don't like using libpq. That makes the whole
thing a lot more complicated; for one thing, it will need to "connect" to
every database in some fashion. Also, you want it to "skip" normal
permission checks, which would be doable only if it's not using libpq. On
the other hand, if there were multiple autovacuum processes, one per
database, it would all be much easier, without using libpq.

Could we clarify what scenario people are envisioning? I don't want to
waste time fixing code that in the end is going to be declared
fundamentally flawed -- I'd rather work on shared dependencies.

Some people say "keep it simple and have one process per cluster." I
think they don't realize it's actually more complex, not the other way
around. With one process per database, the only additional complexity is
how to handle concurrent vacuuming, but the code turns out to be simpler
because we have straightforward access to the system catalogs and the
standard backend infrastructure.

A wholly separate question is what the autovacuum daemon should be
doing. At present we only have "full vacuum", "vacuum" and "analyze".
In the future this can be extended, and autovacuum can launch partial
vacuums, nappy vacuums, bitmapped vacuums, coffee-with-cream vacuums.
But we need to start somewhere.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"�Qu� importan los a�os? Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda)

#72Alvaro Herrera
alvherre@surnet.cl
In reply to: Gavin Sherry (#58)
Re: Autovacuum in the backend

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest. Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"�C�mo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germ�n Poo)

#73Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Alvaro Herrera (#72)
Re: Autovacuum in the backend

Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest. Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.

Professional advice won't help you here, because you still have to vacuum
this giant table. This is especially critical in the case of 24x7 systems
(which are quite frequent); in many cases there is no maintenance window
anymore (a wastewater system, for example, simply runs 24x7).

Reducing the impact of vacuum and "create index" would be important to
many people. To me, improving vacuum is as important as Jan's bgwriter
patch (which reduced the trouble people had with checkpoints).

best regards,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at

#74Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#71)
Re: Autovacuum in the backend

Alvaro Herrera <alvherre@surnet.cl> writes:

Now, I'm hearing people don't like using libpq.

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process. About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

Some people say "keep it simple and have one process per cluster." I
think they don't realize it's actually more complex, not the other way
around.

Agreed. If you aren't connected to a specific database, then you cannot
use any of the normal backend infrastructure for catalog access, which
is pretty much a killer limitation.

A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).

One thing that might work is to have the postmaster spawn an autovac
process every so often. The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database). It looks through that to
determine which database is most in need of work, then connects to that
database and does some "reasonable" amount of work there, and finally
quits. Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.
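
In outline, the child's whole lifetime would be something like this (a
sketch only; every name below is invented):

static void
autovac_child_main(void)
{
    StatsDump *stats = read_stats_dump_file();   /* no db connection needed */
    Oid        dboid = pick_neediest_database(stats);

    connect_to_database(dboid);
    do_vacuum_work(dboid, WORK_BUDGET);          /* a "reasonable" amount */
    exit(0);                                     /* keep no long-term state */
}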

This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics. But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.

regards, tom lane

#75Dave Page
dpage@vale-housing.co.uk
In reply to: Tom Lane (#74)
Re: Autovacuum in the backend

-----Original Message-----
From: Andreas Pflug [mailto:pgadmin@pse-consulting.de]
Sent: 16 June 2005 15:14
To: Dave Page
Cc: Matthew T. O'Connor; Andrew Dunstan;
swm@linuxworld.com.au; postgres@cybertec.at;
alvherre@surnet.cl; pgman@candle.pha.pa.us;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Autovacuum in the backend

Dave,

i wonder if we should aim to have pgAgent in the backend
which was one
of the reasons why I considered to have it converted from C++
to pure C.

In previous discussions on -hackers when ppl raised the idea of
something like pgAgent being built into the backend, istm that the
majority of people were against the idea.

Regards, Dave.

#76Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#71)
Re: Autovacuum in the backend

Alvaro,

coffee-with-cream vacuums.

I tried this and now my Hoover makes this horrible noise and smokes. ;-)

All:

Seriously, all: when I said that "users" were asking for Autovac in the
backend (AVitB), I wasn't talking about just the newbies on #postgresql. I'm
also talking about companies like Hyperic, and whole groups like
postgresql.org.br.
This is a feature that people want, and unless there's something
fundamentally unstable about it, it seems really stupid to hold it back
because we're planning VACUUM improvements for 8.2.

AVitB has been on the TODO list for 2 versions. There's been 2 years to
question its position there. Now people are bringing up objections when
there's no time for discussion left? This stinks.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#77Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#76)
Re: Autovacuum in the backend

People,

AVitB has been on the TODO list for 2 versions.   There's been 2 years to
question its position there.   Now people are bringing up objections when
there's no time for discussion left?  This stinks.

Hmmm ... to be specific, I'm referring to the objections to the *idea* of
AVitB, not the problems with the current patch.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#78Josh Berkus
josh@agliodbs.com
In reply to: Dave Page (#75)
Re: Autovacuum in the backend

Dave,

In previous discussions on -hackers when ppl raised the idea of
something like pgAgent being built into the backend, istm that the
majority of people were against the idea.

Well, you're up against the minimalist approach to core PostgreSQL there. It
would pretty much *have* to be an optional add-in, even if it was stored in
pg_catalog. I can see a lot of uses for a back-end job scheduler myself, but
it would need to go through the gauntlet of design criticism first <wry
grin>.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#79Douglas McNaught
doug@mcnaught.org
In reply to: Josh Berkus (#76)
Re: Autovacuum in the backend

Josh Berkus <josh@agliodbs.com> writes:

Seriously, all: when I said that "users" were asking for Autovac in the
backend (AVitB), I wasn't talking about just the newbies on #postgresql. I'm
also talking about companies like Hyperic, and whole groups like
postgresql.org.br.
This is a feature that people want, and unless there's something
fundamentally unstable about it, it seems really stupid to hold it back
because we're planning VACUUM improvements for 8.2.

Agreed, and I don't see AVitB as standing in the way of any of those
proposed improvements--it's just that AVitB has a chance of making it
into 8.1, and none of the proposed improvements do. I don't see why
people are objecting.

Also, count me in the "turn it on by default" crowd--I'd rather not
have newbies see unending file bloat from normal usage, it just looks
bad. Anyone who plans to deploy for large databases and high loads
needs to learn to tune (just as with any other database) and can make
an informed decision about whether AV should be on or not.

-Doug

#80Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#74)
Re: Autovacuum in the backend

Tom Lane wrote:

Alvaro Herrera <alvherre@surnet.cl> writes:

Now, I'm hearing people don't like using libpq.

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process. About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

Yes, libpq has to go; I thought this was clear, but perhaps I didn't say
it clearly enough. Anyway, this was the stumbling block that prevented
me from making more progress on autovacuum integration.

Some people say "keep it simple and have one process per cluster." I
think they don't realize it's actually more complex, not the other way
around.

A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).

Agreed, this seems ugly.

One thing that might work is to have the postmaster spawn an autovac
process every so often. The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database). It looks through that to
determine which database is most in need of work, then connects to that
database and does some "reasonable" amount of work there, and finally
quits. Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.

I don't think you can use a dump to determine which database should be
visited next, since you don't really know what has happened since the
last time you exited. What was a priority 5 or 10 minutes ago might not
be a priority now.

This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics. But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.

This is not a problem, as my patch (which Alvaro has now taken over)
already created a new system catalog for all autovac data, so autovac
really doesn't contain any static persistent data.

The rough design I had in mind was:
1) On startup postmaster spawns the master autovacuum process
2) The master autovacuum process spawns backends to do the vacuuming
work on a particular database
3) The master autovacuum waits for this process to exit, then spawns the
next backend for the next database
4) Repeat this loop until all databases in the cluster have been
checked, then sleep for a while, and start over again.
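
In very rough C, with invented helper names, that loop would be:

for (;;)
{
    int i;

    for (i = 0; i < ndatabases; i++)
    {
        /* step 2: spawn a backend for one database ... */
        pid_t child = fork_autovacuum_backend(dboids[i]);

        /* step 3: ... and wait for it before moving on */
        waitpid(child, NULL, 0);
    }
    /* step 4: all databases checked; nap, then start over */
    sleep(autovacuum_naptime);
}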

I'm not sure if this is feasible, or if this special master autovacuum
process would be able to fork off or request that the postmaster fork
off an autovacuum process for a particular database in the cluster.
Thoughts or comments?

Matthew

#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew T. O'Connor (#80)
Re: Autovacuum in the backend

"Matthew T. O'Connor" <matthew@zeut.net> writes:

I don't think you can use a dump to determine which database should be
visited next, since you don't really know what has happened since the
last time you exited. What was a priority 5 or 10 minutes ago might not
be a priority now.

Well, the information necessary to make that decision has to be
available from the statistics file. This doesn't seem like an
insuperable problem.

The rough design I had in mind was:
1) On startup postmaster spawns the master autovacuum process
2) The master autovacuum process spawns backends to do the vacuuming
work on a particular database
3) The master autovacuum waits for this process to exit, then spawns the
next backend for the next database
4) Repeat this loop until all databases in the cluster have been
checked, then sleep for a while, and start over again.

This is unworkable, I believe, because backends have to be direct
children of the postmaster. I don't recall the details at the moment
but there are IPC signaling reasons for it.

I'm not sure if this is feasible, or if this special master autovacuum
process would be able to fork off or request that the postmaster fork
off an autovacuum process for a particular database in the cluster.
Thoughts or comments?

It's possible that we could add some signaling whereby the autovac
master could request the postmaster to fork a child into a particular
database. I'm not sure why this is a lot better than keeping the
stats out where everyone can see them...
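
The signaling itself could be as small as this (all names hypothetical):
the master publishes the target database's OID in shared memory and pings
the postmaster, which does the actual fork.

/* in the autovac master */
AutoVacShmem->target_dboid = dboid;
kill(PostmasterPid, SIGUSR1);

/* in the postmaster, on SIGUSR1 */
if (AutoVacShmem->target_dboid != InvalidOid)
    StartAutoVacChild(AutoVacShmem->target_dboid);  /* forks a backend */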

regards, tom lane

#82Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#81)
Re: Autovacuum in the backend

Tom Lane wrote:

"Matthew T. O'Connor" <matthew@zeut.net> writes:

I don't think you can use a dump to determine which database should be
visited next, since you don't really know what has happened since the
last time you exited. What was a priority 5 or 10 minutes ago might not
be a priority now.

Well, the information necessary to make that decision has to be
available from the statistics file. This doesn't seem like an
insuperable problem.

Interesting, so the postmaster would kick off an autovacuum process,
which would read in data from the stats system by hand (it can do this
because the stats system writes its data to flat files?). I don't know
how complicated this might be, but perhaps a simpler method is to just
have each autovacuum process write a file for itself noting what
database it should connect to next. This would work fine assuming we
want to continue to loop through all the databases in much the same
fashion as pg_autovacuum currently does.

The rough design I had in mind was:
1) On startup postmaster spawns the master autovacuum process
2) The master autovacuum process spawns backends to do the vacuuming
work on a particular database
3) The master autovacuum waits for this process to exit, then spawns the
next backend for the next database
4) Repeat this loop until all databases in the cluster have been
checked, then sleep for a while, and start over again.

It's possible that we could add some signaling whereby the autovac
master could request the postmaster to fork a child into a particular
database. I'm not sure why this is a lot better than keeping the
stats out where everyone can see them...

Ok.

#83Hannu Krosing
hannu@skype.net
In reply to: Tom Lane (#74)
Re: Autovacuum in the backend

On N, 2005-06-16 at 11:42 -0400, Tom Lane wrote:

Alvaro Herrera <alvherre@surnet.cl> writes:

...

Some people say "keep it simple and have one process per cluster." I
think they don't realize it's actually more complex, not the other way
around.

Agreed. If you aren't connected to a specific database, then you cannot
use any of the normal backend infrastructure for catalog access, which
is pretty much a killer limitation.

A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).

In this case it should also be configurable which databases get
their own AV processes.

Also, there is probably no need to keep an AV process running for very
long after the last "real" backend for that database has closed, as there
won't be any changes anyway.

Having one AV process per DB will likely be a problem only for
installations with very many single-user, always-connected databases,
which I don't expect to be that common.

And I also expect that soon (after my vacuums-don't-step-on-each-other
patch goes in) there will be a need to run several vacuums in parallel
on the same database (say, one with non-intrusive vacuum_page settings
for a really large table, and several more aggressive ones for
fast-changing small tables at the same time). AFAICS this will also
need several backends - at least one for each parallel vacuum.

One thing that might work is to have the postmaster spawn an autovac
process every so often.

My fastest manual vacuum does its job in 5 seconds and is repeated at
10-second intervals - will this design be able to match that?

The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database). It looks through that to
determine which database is most in need of work, then connects to that
database and does some "reasonable" amount of work there, and finally
quits. Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.

This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics. But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.

--
Hannu Krosing <hannu@skype.net>

#84Andreas Pflug
pgadmin@pse-consulting.de
In reply to: Josh Berkus (#78)
Re: Autovacuum in the backend

Josh Berkus wrote:

Dave,

In previous discussions on -hackers when ppl raised the idea of
something like pgAgent being built into the backend, istm that the
majority of people were against the idea.

Well, you're up against the minimalist approach to core PostgreSQL there. It
would pretty much *have* to be an optional add-in, even if it was stored in
pg_catalog. I can see a lot of uses for a back-end job scheduler myself, but
it would need to go through the gauntlet of design criticism first <wry
grin>.

You want to scare me, don't you? :-)

We have a growing zoo of daemons that can be regarded as tightly
integrated server add-on processes (slony, autovac, pgAgent), and it
would be really nice (say: win32 users are used to it, thus requiring
it) to have a single point of control.

Maybe a super daemon (on win32 probably pg_ctl), controlling the
postmaster and all those helper processes (accessible through pgsql
functions, of course), would be the solution. This keeps the kernel
clean, separates backend shmem from helper processes, and enables
control over all processes.

Regards,
Andreas

#85Andrew Dunstan
andrew@dunslane.net
In reply to: Andreas Pflug (#84)
Re: Autovacuum in the backend

Andreas Pflug wrote:

We have a growing zoo of daemons that can be regarded as tightly
integrated server add-on processes (slony, autovac, pgAgent), and it
would be really nice (say: win32 users are used to it, thus requiring
it) to have a single point of control.

Maybe a super daemon (on win32 probably pg_ctl), controlling the
postmaster and all those helper processes (accessible through pgsql
functions, of course), would be the solution. This keeps the kernel
clean, separates backend shmem from helper processes, and enables
control over all processes.

And this will be ready when? I thought we were discussing what could be
done regarding AVitB between now and feature freeze for 8.1 in about 2
weeks. This surely doesn't come into that category.

cheers

andrew

#86Gavin Sherry
swm@linuxworld.com.au
In reply to: Alvaro Herrera (#72)
Re: Autovacuum in the backend

On Thu, 16 Jun 2005, Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest. Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.

As I've said a few times, I'm not concerned about such users. I'm
concerned about users with some busy tables of a few hundred megabytes. I
still don't think VACUUM at arbitrary times on such tables is suitable.

Thanks,

Gavin

#87Andreas Pflug
pgadmin@pse-consulting.de
In reply to: Andrew Dunstan (#85)
Re: Autovacuum in the backend

Andrew Dunstan wrote:

Andreas Pflug wrote:

We're having a growing zoo of daemons that can be regarded as tightly
integrated server add-on processes (slony, autovac, pgAgent), and it
would be really nice (say: win32 users are used to it, thus requiring
it) to have a single point of control.

Maybe a super daemon (in win32 probably pg_ctl), controlling
postmaster and all those helper processes (accessible through pgsql
functions, of course) would be the solution. This keeps the kernel
clean, separates backend shmem from helper processes and enables
control over all processes.

And this will be ready when?

This is certainly 8.2 stuff. I'll be sufficiently glad if the
instrumentation stuff that was posted pre-8.0, and left out back then
because the committers ran out of time, makes it into 8.1...

I thought we were discussing what could be done regarding AVitB
between now and feature freeze for 8.1 in about 2 weeks. This surely
doesn't come into that category.

I agree with previous posters that we should have AV on by default, so
that the system performs correctly out of the box for smaller
installations. Even a functionally cut-down version of AV running by
default, which has to be stopped and replaced by a more sophisticated
solution for high-performance installations, is better than what we have now.

Regards,
Andreas

#88Tim Allen
tim@proximity.com.au
In reply to: Josh Berkus (#76)
Re: Autovacuum in the backend

Josh Berkus wrote:

Alvaro,

coffee-with-cream vacuums.

I tried this and now my Hoover makes this horrible noise and smokes. ;-)

Probably related to the quality of American coffee ;).

All:

Seriously, all: when I said that "users" were asking for Autovac in the
backend (AVitB), I wasn't talking about just the newbies on #postgresql. I'm
also talking about companies like Hyperic, and whole groups like postgresql.org.br.
This is a feature that people want, and unless there's something
fundamentally unstable about it, it seems really stupid to hold it back
because we're planning VACUUM improvements for 8.2.

AVitB has been on the TODO list for 2 versions. There's been 2 years to
question its position there. Now people are bringing up objections when
there's no time for discussion left? This stinks.

Complete agreement from me. Incremental improvements are good - pointing
out that there are some other incremental improvements that would also
be good to make is not an argument for delaying the first set of
incremental improvements.

In our case, we want to be able to install postgres at dozens (ideally
hundreds... no, thousands :) ) of customer sites, where the customers in
general are not going to have anyone onsite who has a clue about
postgres. The existing contrib autovacuum gives a good solution to
setting things up to maintain the database in a reasonable state of
health without need for further intervention from us. It's not perfect,
of course, but if it means the difference between having to unleash our
support team on a customer once a month and once a year, that's a good
deal for us. Having it integrated into the backend will make it much
easier for us, we (hopefully...) won't have to fiddle with extra startup
scripts, and we'll have one fewer point of failure (eg some customer
might accidentally turn off the separate pg_autovacuum daemon). Being
able to customise the autovacuum parameters on a per-table basis is also
attractive.

Just my AUD0.02. I realise that keeping _our_ customers happy is not
necessarily anyone else's priority. I'd like to be able to invest some
coding time, but can't. I haven't even gotten around to completing
Gavin's survey form (sorry Gav, I'll get to it soon, I hope! :)), so I
can't demand to be listened to.

But for what it's worth, Alvaro, please keep going, don't be dissuaded.

Tim

--
-----------------------------------------------
Tim Allen tim@proximity.com.au
Proximity Pty Ltd http://www.proximity.com.au/

#89Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#13)
Re: Autovacuum in the backend

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process. About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

One reason for not using libpq is that such a solution has to wait for
the completion of each vacuum (we don't have async execution in libpq,
right?), but a signaling-based one does not.

But with signaling, we have to detect whether the forked backend
successfully did its job. I am not sure how to easily incorporate this
into the current signaling framework.
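
For what it's worth, a plain-Unix sketch of one way a parent can tell
whether a forked child finished its job, using the exit status rather
than PostgreSQL's signaling framework (this is illustration only, not
backend code):

/* Detect a forked child's success via its exit status. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return 1;               /* fork failed */

    if (pid == 0)
    {
        /* Child: stand-in for a spawned vacuum worker. */
        exit(0);                /* 0 = job done, nonzero = failure */
    }

    int status;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        printf("worker completed successfully\n");
    else
        printf("worker failed or died\n");
    return 0;
}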

Regards,
Qingqing

#90Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Gavin Sherry (#34)
Re: Autovacuum in the backend

Gavin Sherry wrote:

In January I was in Toronto with Jan, Tom and others and some ideas about
vacuum were being discussed. The basic idea is that when we dirty pages,
we set a bit in a bitmap to say that the page has been dirtied. A
convenient place to do this is when we are writing dirty buffers out to
disk. In many situations, this can happen inside the bgwriter, meaning that
there should be little contention for this bitmap. Of course, individual
backends may be writing pages out and would have to account for the
dirty pages at that point.

Now this bitmap can occur on a per-heap-segment basis (ie, per 1 GB heap
file). You only need 2 pages for the bitmap to represent all the pages in
the segment, which is fairly nice. When vacuum is run, instead of visiting
every page, it would see which pages have been dirtied in the bitmap and
visit only those pages. With large tables and small numbers of modified
tuples/pages, the effect this change would have would be pretty
impressive.
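
As an aside, the two-page figure checks out: a 1 GB segment holds
2^30 / 8192 = 131072 pages, and 131072 bits are exactly two 8 KB pages.
A small self-contained sketch of the arithmetic, with an illustrative
in-memory layout rather than any actual on-disk format:

/* Why a per-segment dirty-page bitmap fits in 2 pages. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ          8192
#define SEGMENT_PAGES   ((1024 * 1024 * 1024) / BLCKSZ)   /* 131072 */

static uint8_t dirty_map[SEGMENT_PAGES / 8];               /* 16 KB */

/* Mark a page dirty; the callers would be the bgwriter or a backend
 * writing the page out itself. */
static void
mark_page_dirty(uint32_t pageno)
{
    dirty_map[pageno / 8] |= (uint8_t) (1 << (pageno % 8));
}

int
main(void)
{
    memset(dirty_map, 0, sizeof(dirty_map));
    mark_page_dirty(42);
    printf("bitmap size: %zu bytes (%zu pages of %d bytes)\n",
           sizeof(dirty_map), sizeof(dirty_map) / BLCKSZ, BLCKSZ);
    return 0;
}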

Added to TODO:

* Create a bitmap of pages that need vacuuming

Instead of sequentially scanning the entire table, have the background
writer or some other process record pages that have expired rows, then
VACUUM can look at just those pages rather than the entire table. In
the event of a system crash, the bitmap would probably be invalidated.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#91Thomas F. O'Connell
tfo@sitening.com
In reply to: Gavin Sherry (#86)
Re: Autovacuum in the backend

Gavin,

For the record, I don't consider myself a PostgreSQL newbie, nor do I
manage any 2 TB databases (much less tables), but I do have an
unusual production use case: thousands (> 10,000) of tables, many of
them inherited, and many of them with hundreds of thousands (a few
with millions) of rows.

Honestly, creating crontab vacuum management for this scenario would
be a nightmare, and pg_autovacuum has been a godsend. Considering the
recent revelations of O(n^2) iterations over table lists in the
current versions and the stated and apparent ease with which this
problem could be solved by integrating the basic functionality of
pg_autovacuum into the backend, I can personally attest to there
being real-world use cases that would benefit tremendously from
integrated autovacuum.

A few months ago, I attempted to solve the wrong problem by
converting a hardcoded threshold into another command-line option. If
I had spotted the O(n^2) problem, I might've spent the time working
on it then instead of the new command-line option. I suppose it's
possible that I'll head down this road anyway if it looks like
integrated pg_autovacuum is going to be put on hold indefinitely
after this discussion.

Anyway, just wanted to throw out some food for thought for the
practicality of a tool like pg_autovacuum.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 16, 2005, at 5:22 PM, Gavin Sherry wrote:

On Thu, 16 Jun 2005, Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest. Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.

As I've said a few times, I'm not concerned about such users. I'm
concerned about users with some busy tables of a few hundred
megabytes. I still don't think VACUUM at arbitrary times on such tables
is suitable.

Thanks,

Gavin

#92Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#46)
Re: Autovacuum in the backend

Josh Berkus wrote:

Josh,

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap on its own to get it to
work. However, I will say that I just don't see the reason for it.

I've personally seen at least a dozen user requests for "autovacuum in the
backend", and had this conversation about 1,100 times:

NB: "After a week, my database got really slow."
Me: "How often are you running VACUUM ANALYZE?"
NB: "Running what?"

Can't argue with that except... RTFM ;). I am not saying it doesn't have
validity. I am just saying that if you actually pay attention to
PostgreSQL and maintain it, you don't need it ;)

Sincerely,

Joshua D. Drake

#93Joshua D. Drake
jd@commandprompt.com
In reply to: Matthew T. O'Connor (#51)
Re: Autovacuum in the backend

Matthew T. O'Connor wrote:

Joshua D. Drake wrote:

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap on its own to get it to
work. However, I will say that I just don't see the reason for it.

The major reasons for autovacuum as I see it are as follows:

* Reduces administrative overhead having to keep track of what tables
need to be vacuumed how often.

Creates more overhead and thus reduces performance.

* Reduces the total amount of time the system spends vacuuming since it
only vacuums when needed.

Can be easily done with cron.

* Keeps stats up-to-date automatically

Which can be done with cron

* Eliminates newbie confusion

RTFM

* Eliminates one of the criticisms that the public has against
PostgreSQL (justifed or not)

Agreed.

Just so everyone knows from the get-go here: I am purposely playing
devil's advocate a little. Autovacuum has some drawbacks. I think we should
be **publicly** aware of them before we pursue integration.

Heaven knows it would make my life easier if it were integrated, but anyway...

Sincerely,

Joshua D. Drake

Also, as VACUUM improves, autovacuum will improve with it.

#94Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Joshua D. Drake (#93)
Re: Autovacuum in the backend

* Reduces the total amount of time the system spends vacuuming since
it only vacuums when needed.

Can be easily done with cron.

* Keeps stats up-to-date automatically

Which can be done with cron

* Eliminates newbie confusion

RTFM

* Eliminates one of the criticisms that the public has against
PostgreSQL (justifed or not)

Agreed.

A few weeks ago I set up a database with more than 1,800 tables
(some complex business thing). Inventing a clever cron-vacuum strategy
is almost impossible (or at least very painful). There should be a
mechanism (fortunately there is pg_autovacuum) to make this a bit more
practical.

In the case of small databases this is not an issue.
Small is always simple. Complex and large are the major challenges.

best regards,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at

#95Andreas Pflug
pgadmin@pse-consulting.de
In reply to: Qingqing Zhou (#89)
Re: Autovacuum in the backend

Qingqing Zhou wrote:

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process. About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

One reason for not using libpq is that such a solution has to wait for
the completion of each vacuum (we don't have async execution in libpq, right?),

There *is* async execution in libpq, and it works.
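
To spell that out, here is a minimal standalone example of the async
API (PQsendQuery and friends), compiled with something like cc file.c
-lpq. The connection string is made up, and the busy-wait loop is a
simplification; a real caller would select() on PQsocket() instead of
spinning:

/* Dispatch a command asynchronously and poll for its completion. */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn   *conn = PQconnectdb("dbname=postgres");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Send the command without blocking until it completes. */
    if (!PQsendQuery(conn, "VACUUM ANALYZE"))
    {
        fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* The caller is free to do other work while polling. */
    while (PQisBusy(conn))
    {
        if (!PQconsumeInput(conn))      /* read whatever has arrived */
        {
            fprintf(stderr, "consume failed: %s", PQerrorMessage(conn));
            return 1;
        }
    }

    /* Drain all results; PQgetResult returns NULL when done. */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);

    PQfinish(conn);
    return 0;
}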

Regards,
Andreas

#96Russell Smith
mr-russ@pws.com.au
In reply to: Joshua D. Drake (#93)
Re: Autovacuum in the backend

The major reasons for autovacuum as I see it are as follows:

* Reduces administrative overhead having to keep track of what tables
need to be vacuumed how often.

Creates more overhead and thus reduces performance.

Or reduces vacuum overhead because the vacuum strategy is much better than
it was when you used cron. Especially as people get a chance to improve autovac.

* Reduces the total amount of time the system spends vacuuming since it
only vacuums when needed.

Can be easily done with cron.

Can you do partial table vacuums with cron?
Can you work out the smartest time to vacuum with cron? I thought it just scheduled tasks at certain times.

* Keeps stats up-to-date automatically

Which can be done with cron

And what is the management strategy for adjusting analyze when things change that you weren't aware of (eg, big table changes that were unexpected)?

* Eliminates newbie confusion

RTFM

To be honest, answering RTFM = the user going to MySQL, in a lot of cases.

* Eliminates one of the criticisms that the public has against
PostgreSQL (justifed or not)

Agreed.

This is really the same as the previous RTFM question/response. People criticise because vacuum is foreign to them,
and for newbies that equals "too hard, next db please". As much as it is a technical issue, it's an advocacy issue too.

Plus we finally get XID wraparound protection. We finally decided that for 8.1 we needed some protection, which I think
Tom committed. This again may be a newbie thing, but there are a lot of newbies out there. We've seen this problem pop up
a number of times on the lists and on IRC, and people ask "Why didn't it tell me?". RTFM isn't exactly what they want
to hear, or they thought they had read the manual and missed understanding that bit.

Just so everyone knows from the get-go here: I am purposely playing
devil's advocate a little. Autovacuum has some drawbacks. I think we should
be **publicly** aware of them before we pursue integration.

It does have a number of issues. But I feel the integration issue is being addressed with a very short-term view.
Once it's integrated, a lot of patches, tweaks and changes become possible that just can't be made while it stays outside the backend.
The usefulness of some of the vacuum ideas that have been presented in the past will be able to become a reality.
The dead space map is a perfect example. People have talked about it for most of the time I've been around.
But until we have an integrated autovacuum none of that can really happen.

Heaven knows it would make my life easier if it were integrated, but anyway...

I understand these are not necessarily Josh's views, but I thought I would offer comments on them.

Sincerely,

Joshua D. Drake

Regards

Russell Smith

Also, as VACUUM improves, autovacuum will improve with it.

Or because of autovacuum, vacuum and autovacuum will improve.

#97Russell Smith
mr-russ@pws.com.au
In reply to: Bruce Momjian (#90)
Re: Autovacuum in the backend

Added to TODO:

* Create a bitmap of pages that need vacuuming

Instead of sequentially scanning the entire table, have the background
writer or some other process record pages that have expired rows, then
VACUUM can look at just those pages rather than the entire table. In
the event of a system crash, the bitmap would probably be invalidated.

Further to this, is there any use case for allowing the FSM, or this DSM, to spill to disk
if the space fills up? It would mean that unusual changes to the db wouldn't lose
track of free space. You could just load part of the overflow from disk back
into the in-memory FSM and continue using free space.

Regards

Russell Smith

#98Russell Smith
mr-russ@pws.com.au
In reply to: Matthew T. O'Connor (#67)
Re: Autovacuum in the backend

4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.

Agreed, it doesn't solve all problems, and I'm not arguing that the
integration of AV makes PostgreSQL newbie-safe; it just helps reduce the
newbie problem. Again, if the default FSM settings are inappropriate
for a database then the user is probably doing something more
complicated than a "my cat minka" database and will need to learn some
tuning skills anyway.

5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there's bound
to be bloating issues (see 4).

But what's stopping the implementation of a partial VACUUM FULL, where we lock the table and
move enough blocks to shorten the relation so that there is, say, < 10% bloat, or whatever is
appropriate for that table? Or even just shorten the table by a few blocks, and repeat the process
when you have some time.

Not totally true: regular VACUUM can shrink tables a little (I think
it can only cut off free space at the end of the table, without moving
data around). But if AV is on and the settings are
reasonable, then a table shouldn't bloat much or at all. Also, I don't
think we are telling people to VACUUM less; in fact tables that need it
will usually get VACUUM'd more. We are just telling the users that if
they turn AV on, they don't have to manage all the VACUUMing.

Regards

Russell Smith

#99Russell Smith
mr-russ@pws.com.au
In reply to: Andreas Pflug (#95)
Re: Autovacuum in the backend

On Fri, 17 Jun 2005 06:26 pm, Andreas Pflug wrote:

Qingqing Zhou wrote:

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process. About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

One reason for not using libpq is that such a solution has to wait for
the completion of each vacuum (we don't have async execution in libpq, right?),

There *is* async execution in libpq, and it works.

I would have thought the main reason for not using libpq is that it locks you
into only using commands that are available to all users via SQL. If you don't use
libpq, you open up the ability to use functions that can make use of information available
to the backend, and to run functions in a way that is not possible via SQL.

Regards

Russell Smith.

#100Gavin Sherry
swm@linuxworld.com.au
In reply to: Russell Smith (#97)
Re: Autovacuum in the backend

On Fri, 17 Jun 2005, Russell Smith wrote:

Added to TODO:

* Create a bitmap of pages that need vacuuming

Instead of sequentially scanning the entire table, have the background
writer or some other process record pages that have expired rows, then
VACUUM can look at just those pages rather than the entire table. In
the event of a system crash, the bitmap would probably be invalidated.

Further to this, is there any use case for allowing the FSM, or this DSM, to spill to disk
if the space fills up? It would mean that unusual changes to the db wouldn't lose
track of free space. You could just load part of the overflow from disk back
into the in-memory FSM and continue using free space.

FSM spilling to disk would be a problem. The reason is that when we need
to allocate an empty page, we hit the FSM first. If that operation becomes
disk-bound, large updates and inserts are going to really suck from a
performance point of view.

The idea I discussed is disk-backed, because it's the first few pages of
every heap segment. This map doesn't mean that pages are free. It means
they've been modified.
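
To make that concrete, here is a toy sketch of a vacuum pass consulting
such a modified-page map and skipping untouched pages; the names and
the in-memory layout are made up for illustration:

/* Visit only the pages flagged as modified in a per-segment map. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SEGMENT_PAGES 131072            /* pages in a 1 GB segment */

static uint8_t modified_map[SEGMENT_PAGES / 8];

static int
page_is_modified(uint32_t pageno)
{
    return (modified_map[pageno / 8] >> (pageno % 8)) & 1;
}

static void
vacuum_page(uint32_t pageno)
{
    printf("vacuuming page %u\n", pageno);  /* placeholder */
}

int
main(void)
{
    memset(modified_map, 0, sizeof(modified_map));
    modified_map[0] = 0x05;             /* pretend pages 0 and 2 changed */

    /* Skip straight past untouched pages instead of scanning them. */
    for (uint32_t pageno = 0; pageno < SEGMENT_PAGES; pageno++)
        if (page_is_modified(pageno))
            vacuum_page(pageno);
    return 0;
}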

Gavin

#101Gavin Sherry
swm@linuxworld.com.au
In reply to: Russell Smith (#98)
Re: Autovacuum in the backend

On Fri, 17 Jun 2005, Russell Smith wrote:

4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.

Agreed, it doesn't solve all problems, and I'm not arguing that the
integration of AV makes PostgreSQL newbie-safe; it just helps reduce the
newbie problem. Again, if the default FSM settings are inappropriate
for a database then the user is probably doing something more
complicated than a "my cat minka" database and will need to learn some
tuning skills anyway.

5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there's bound
to be bloating issues (see 4).

But what's stopping the implementation of a partial VACUUM FULL, where we lock the table and
move enough blocks to shorten the relation so that there is, say, < 10% bloat, or whatever is
appropriate for that table? Or even just shorten the table by a few blocks, and repeat the process
when you have some time.

It's again a question of where you start off from. You cannot just say
'I've vacuumed the first 100 pages' because the table could well have
changed underneath you.

Gavin

#102Matthew T. O'Connor
matthew@zeut.net
In reply to: Joshua D. Drake (#92)
Re: Autovacuum in the backend

Joshua D. Drake wrote:

Josh Berkus wrote:

I've personally seen at least a dozen user requests for "autovacuum
in the backend", and had this conversation about 1,100 times:

NB: "After a week, my database got really slow."
Me: "How often are you running VACUUM ANALYZE?"
NB: "Running what?"

Can't argue with that except... RTFM ;). I am not saying it doesn't have
validity. I am just saying that if you actually pay attention to
PostgreSQL and maintain it, you don't need it ;)

I think everyone on this list would agree with you. The only reason I
think the newbie protection is important (and I don't think it's the
most important reason for autovacuum) is that perception is reality to
some extent. Valid or not we still suffer from a reputation of being
more complicated and slower than mysql. Steps towards reducing /
eliminating that perception can only be good for us as I think lots of
developers make their first database decision based solely on their
perceptions and then just stick with what they know.

#103Matthew T. O'Connor
matthew@zeut.net
In reply to: Joshua D. Drake (#93)
Re: Autovacuum in the backend

Joshua D. Drake wrote:

Matthew T. O'Connor wrote:

The major reasons for autovacuum as I see it are as follows:

* Reduces administrative overhead having to keep track of what tables
need to be vacuumed how often.

Creates more overhead and thus reduces performance.

In the general case, I disagree. Overall having your tables vacuumed
and analyzed only when needed and never when not needed can only reduce
system overhead. Granted there are limitations in the contrib version
of autovacuum, some of which go away in the integrated case.

* Reduces the total amount of time the system spends vacuuming since
it only vacuums when needed.

Can be easily done with cron.

Really? What happens when your load / usage patterns change? When a
table is added that gets heavily used?

* Keeps stats up-to-date automatically

Which can be done with cron

Same response as above.

* Eliminates newbie confusion

RTFM

;-)

* Eliminates one of the criticisms that the public has against
PostgreSQL (justifed or not)

Agreed.

Just so everyone knows from the get-go here: I am purposely playing
devil's advocate a little. Autovacuum has some drawbacks. I think we should
be **publicly** aware of them before we pursue integration.

Understood.

#104Matthew T. O'Connor
matthew@zeut.net
In reply to: Russell Smith (#96)
Re: Autovacuum in the backend

Russell Smith wrote:

* Reduces the total amount of time the system spends vacuuming since it
only vacuums when needed.

Can be easily done with cron.

Can you do partial table vacuums with CRON?
You can work out the smartest time to vacuum with cron? I thought it just scheduled tasks at certain times.

To be fair, autovacuum can't do partial table vacuums either, in fact
nothing can right now. Perhaps someday something like this will be
feasible.

* Eliminates one of the criticisms that the public has against
PostgreSQL (justifed or not)

Agreed.

This is really the same as the previous RTFM question/response. People criticise because vacuum is foreign to them,
and for newbies that equals "too hard, next db please". As much as it is a technical issue, it's an advocacy issue too.

This bullet point is absolutely an advocacy issue. Every developer that
says "next db please" will probably not come back to PostgreSQL for
quite some time, thus bolstering the userbase of the competition and
reducing the userbase of PostgreSQL.

Plus we finally get XID wraparound protection. We finally decided that for 8.1 we needed some protection, which I think
Tom committed. This again may be a newbie thing, but there are a lot of newbies out there. We've seen this problem pop up
a number of times on the lists and on IRC, and people ask "Why didn't it tell me?". RTFM isn't exactly what they want
to hear, or they thought they had read the manual and missed understanding that bit.

I think this point hasn't been stressed enough. With nested
transactions these days (not to mention faster hardware) I can see XID
wraparound becoming a much bigger issue.

#105Matthew T. O'Connor
matthew@zeut.net
In reply to: Russell Smith (#99)
Re: Autovacuum in the backend

Russell Smith wrote:

On Fri, 17 Jun 2005 06:26 pm, Andreas Pflug wrote:

Qingqing Zhou wrote:

One reason for not using libpq is that such a solution has to wait for
the completion of each vacuum (we don't have async execution in libpq, right?),

There *is* async execution in libpq, and it works.

I would have thought the main reason for not using libpq is that it locks you
into only using commands that are available to all users via SQL. If you don't use
libpq, you open up the ability to use functions that can make use of information available
to the backend, and to run functions in a way that is not possible via SQL.

Right now we don't really want to fire off more than one VACUUM at a
time since it will create a more substantial IO storm issue than we
already have with vacuum. Perhaps with the introduction of vacuum delay
settings and table spaces we could / should rethink this, but for now
it's the easiest way to go.

As for the standard SQL issue, FSM data (or anything else we might want)
could be exported via regular SQL through some type of superuser-only
system function. So that isn't really the issue. I don't remember all
the details, but you can look at the discussion when my patch was
rejected (around July of 2004). People just didn't like including libpq
in the backend, for reasons I don't remember. I don't *think* this is
up for discussion; I *think* autovacuum has to work without libpq if it
is going to be accepted.

Matthew

#106Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew T. O'Connor (#105)
Re: Autovacuum in the backend

"Matthew T. O'Connor" <matthew@zeut.net> writes:

... People just didn't like including libpq
into the backend for reasons I don't remember.

One reason I can think of is that there would be global-symbol conflicts
--- libpq has copies of some backend routines, but they are not
identical.

In any case, the argument that's being made here is that an integrated
autovac would be simple and newbie-friendly. I think it's impossible
for a libpq-using autovac to ever be invisible to the DBA, if only
because he has to configure it with an appropriate username/password,
make sure pg_hba.conf will actually let it into every database, etc.
There are way too many foot-guns in that scenario for my taste.

regards, tom lane

In reply to: Joshua D. Drake (#93)
Re: Autovacuum in the backend

On Fri, 2005-06-17 at 00:03 -0700, Joshua D. Drake wrote:

Matthew T. O'Connor wrote:

Joshua D. Drake wrote:

Just my own two cents. First, I am not knocking the work that has been
done on autovacuum. I am sure that it was a leap on its own to get it to
work. However, I will say that I just don't see the reason for it.

The major reasons for autovacuum as I see it are as follows:

* Reduces administrative overhead having to keep track of what tables
need to be vacuumed how often.

Creates more overhead and thus reduces performance.

Agreed, except I have a number of tables that are over-scheduled with
cron, because about once a week (at a random time) they go through
heavy enough churn to require the more frequent vacuum, but the rest of
the time they're fairly quiet. It's not ideal, but autovacuum was the
difference between a 10-minute vacuum 24/7 and a 10-minute vacuum on the
table for just the heavy 2 or so hours, which appear at random.

In the case of unexpected or uneven loads, cron isn't particularly
useful.

--

#108Josh Berkus
josh@agliodbs.com
In reply to: Joshua D. Drake (#93)
Re: Autovacuum in the backend

Josh,

Just so everyone knows from the get-go here: I am purposely playing
devil's advocate a little.

Well, please stop it. We discussed AV over a year ago, when we ran out of time
to integrate it into 8.0. This discussion now is hindering any discussion
of what needs to be *done* to integrate it. This isn't a debating society.

Folks, I'm sorry to be so grumpy about this, but so far 80% of the posts on
this thread have been re-arguing a discussion we had in 2004. Which isn't
helping Alvaro get anything done.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#109Matthew T. O'Connor
matthew@zeut.net
In reply to: Christopher Browne (#65)
Re: Autovacuum in the backend

Christopher Browne wrote:

swm@linuxworld.com.au (Gavin Sherry) wrote:

I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is
a good contrib style solution: it addresses a problem users have and
attempts to solve it the way other users might try and solve it. When you
consider it in the backend, it looks like a workaround. I think users are
better served by solving the real problem.

Hear, hear!

It seems to me that the point in time at which it is *really*
appropriate to put this into the backend is when the new GUC variable
"dead_tuple_map_size" (akin to FSM) is introduced, and there is a new
sort of 'VACUUM DEAD TUPLES' command which goes through the DTPM (Dead
Tuple Page Map).

In THAT case, there would be the ability to do a VACUUM on the "dead
bits" of the table that consists of 50M rows without having to go
through the 49M rows that haven't been touched in months.

This will make VACUUM less painful, but it doesn't eliminate the need /
desire for autovacuum. I agree this would be good, but I see it as a
separate issue.

#110Jim C. Nasby
decibel@decibel.org
In reply to: Matthew T. O'Connor (#109)
Re: Autovacuum in the backend

On Fri, Jun 17, 2005 at 12:21:44PM -0400, Matthew T. O'Connor wrote:
<snip more stuff about how we need to track pages with dead tuples>

This will make VACUUM less painful, but it doesn't eliminate the need /
desire for autovacuum. I agree this would be good, but I see it as a
separate issue.

Not only is it a separate issue, but there's also no way it could
possibly be done for 8.1, whereas autovacuum most likely will make it
into 8.1. Additionally, there are noted improvements that come about by
putting autovacuum in the backend instead of leaving it in contrib. And
as others have mentioned numerous times, any improvements made to vacuum
will help out autovacuum as well. There simply isn't a downside to putting
it in the backend that anyone's brought up.

Autovacuum was originally scheduled for 8.0. There's been plans to put
it in the backend for close to 2 years now. There's no reason at all to
push it out any farther.
--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

#111Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Gavin Sherry (#34)
Space reuse and autovacuum

Gavin Sherry wrote:

On Wed, 15 Jun 2005, Bruce Momjian wrote:

I am going to start working on it. I am concerned it is a big job.

I will post questions as I find them, and the one below is a good one.

I'm wondering if effort is being misdirected here. I remember when Mark
Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
significant performance loss -- I think on the order of 30% to 40% (I will
try and dig up a link to the results).

I think these results can be dramatically improved if the focus is on a
more effective vacuum.

Let's look at what TODO has for vacuum and how autovacuum fits that:

Vacuum
======

* Improve speed with indexes

For large table adjustments during vacuum, it is faster to reindex
rather than update the index.

This is something we should figure out how to do automatically.

* Reduce lock time by moving tuples with read lock, then write
lock and truncate table

Moved tuples are invisible to other backends so they don't require a
write lock. However, the read lock promotion to write lock could lead
to deadlock situations.

The deadlock problem here seems bad.

* -Add a warning when the free space map is too small

Done.

* Maintain a map of recently-expired rows

This allows vacuum to target specific pages for possible free space
without requiring a sequential scan.

I think of this as a secondary free-space map (FSM), where instead of
recording rows/pages that have free space, we record rows/pages that
have expired rows that might be free for reuse once all transactions in
which they were visible have completed (a toy sketch of such an entry
appears after this list).

* Auto-fill the free space map by scanning the buffer cache or by
checking pages written by the background writer

This could be used to populate the secondary FSM above.

* Create a bitmap of pages that need vacuuming

Instead of sequentially scanning the entire table, have the background
writer or some other process record pages that have expired rows, then
VACUUM can look at just those pages rather than the entire table. In
the event of a system crash, the bitmap would probably be invalidated.

This is an alternative to the FSM that tracks _all_ possible free space
rather than just a limited amount like a secondary FSM in shared memory.

* Auto-vacuum
o Move into the backend code
o Use free-space map information to guide refilling
o Do VACUUM FULL if table is nearly empty?

It seems that no matter which TODO items we complete above, we will need some
type of automatic vacuum to direct filling the free space map. It might
be done using a different method than a sequential-scan vacuum, but it
will be needed, so we are right to integrate autovacuum now and improve how
it does its job in future releases.
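
As promised above, a toy sketch of what an entry in such a map of
recently-expired rows might look like; the types and fields here are
illustrative guesses, not actual backend definitions:

/* Hypothetical "recently expired rows" map entry: remember which
 * pages hold expired tuples and the newest xid that deleted one. */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;
typedef uint32_t BlockNumber;

typedef struct ExpiredPageEntry
{
    uint32_t        relfilenode;    /* which relation */
    BlockNumber     pageno;         /* which page holds expired rows */
    TransactionId   newest_xmax;    /* space is reclaimable once no
                                     * snapshot can see this xid */
} ExpiredPageEntry;

/* Reclaimable when every transaction that could still see the dead
 * rows is gone, i.e. the oldest running xid is past newest_xmax
 * (ignoring xid wraparound for the sake of the sketch). */
static int
page_is_reclaimable(const ExpiredPageEntry *e, TransactionId oldest_xmin)
{
    return e->newest_xmax < oldest_xmin;
}

int
main(void)
{
    ExpiredPageEntry e = { 16384, 7, 1000 };

    printf("page %u reclaimable: %s\n", e.pageno,
           page_is_reclaimable(&e, 1200) ? "yes" : "no");
    return 0;
}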

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073