Re: CPU-intensive autovacuuming
Following up on my own post from last night:
Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?
Browsing the code using webcvs, I have found this:
for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {
I haven't really tried to understand what is going on in here, but it
does look like it is getting the result of the "pg_class join stats"
query and then matching it up against its internal list of tables using
nested loops, which is undoubtedly O(n^2) in the number of tables.
Have I correctly understood what is going on here?
--Phil.
Phil Endecott wrote:
Following up on my own post from last night:
Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?
Browsing the code using webcvs, I have found this:
for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {
I haven't really tried to understand what is going on in here, but it
does look like it is getting the result of the "pg_class join stats"
query and then matching it up against its internal list of tables
using nested loops, which is undoubtedly O(n^2) in the number of tables.
Have I correctly understood what is going on here?
Indeed you have. I have heard a few similar reports but perhaps none as
bad as yours. One person put a small sleep value so that it doesn't
spin so tight. You could also just up the sleep delay so that it
doesn't do this work quite so often. No other quick suggestions.
Matthew T. O'Connor wrote:
Phil Endecott wrote:
Could it be that there is some code in autovacuum that is O(n^2) in
the number of tables?
Browsing the code using webcvs, I have found this:
for (j = 0; j < PQntuples(res); j++)
{
    tbl_elem = DLGetHead(dbs->table_list);
    while (tbl_elem != NULL)
    {
Have I correctly understood what is going on here?
Indeed you have. I have heard a few similar reports but perhaps none as
bad as yours. One person put a small sleep value so that it doesn't
spin so tight. You could also just up the sleep delay so that it
doesn't do this work quite so often. No other quick suggestions.
I do wonder why autovacuum is keeping its table list in memory rather
than in the database.
But given that it is keeping it in memory, I think the real fix is to
sort that list (or keep it ordered when building or updating it). It is
trivial to also get the query results ordered, and they can then be
compared in O(n) time.
I notice various other places where there seem to be nested loops, e.g.
in the update_table_list function. I'm not sure if they can be fixed by
similar means.
--Phil.
Phil Endecott wrote:
Matthew T. O'Connor wrote:
Indeed you have. I have head a few similar reports but perhaps none
as bad as yours. One person put a small sleep value so that it
doesn't spin so tight. You could also just up the sleep delay so
that it doesn't do this work quite so often. No other quick
suggestions.
I do wonder why autovacuum is keeping its table list in memory rather
than in the database.
For better or worse, this was a conscious design decision that the
contrib version of autovacuum be non-invasive to your database.
But given that it is keeping it in memory, I think the real fix is to
sort that list (or keep it ordered when building or updating it). It
is trivial to also get the query results ordered, and they can then be
compared in O(n) time.
I'm quite sure there is a better way; please submit a patch if you can.
This was never a real concern for most people since the number of tables
is typically small enough not to be a problem. The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.
I notice various other places where there seem to be nested loops,
e.g. in the update_table_list function. I'm not sure if they can be
fixed by similar means.
I would think so, they all basically do the same type of loop.
Matthew T. O'Connor wrote:
The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.
What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will arrive
eventually. I am currently running vacuums from a cron job and I think
I will be happy with that for the time being.
(Incidentally, I have also found that the indexes on my pg_attributes
table were taking up over half a gigabyte, which came down to less than
40 megs after reindexing them. Is there a case for having autovacuum
also call reindex?)
--Phil.
Phil Endecott wrote:
Matthew T. O'Connor wrote:
The integrated version of autovacuum that didn't make the cut before
8.0 avoids this problem since the autovacuum data is stored in the
database.
What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will
arrive eventually. I am currently running vacuums from a cron job and
I think I will be happy with that for the time being.
This is a good question :-) I have been so busy with work lately that I
have not been able to work on it. I am currently trying to resurrect
the patch I sent in for 8.0 and update it so that it applies against
HEAD. Once that is done, I will need help from someone with the
portions of the work that I'm not comfortable with / capable of. The main
issue with the version I created during the 8.0 devel cycle is that it used
libpq to connect, query, and issue commands against the databases. This
was deemed bad, and I need help setting up the infrastructure to make
this happen without libpq. I hope to have my patch applying against
HEAD sometime this week but it probably won't happen till next week.
So the summary of the autovacuum integration status is that we are fast
running out of time (feature freeze July 1), and I have very little time
to devote to this task. So you might want to submit your O(n) patch,
because unfortunately it looks like integrated autovacuum might slip
another release unless someone else steps up to work on it.
(Incidentally, I have also found that the indexes on my pg_attributes
table were taking up over half a gigabyte, which came down to less
than 40 megs after reindexing them. Is there a case for having
autovacuum also call reindex?)
Yes there is certainly some merit to having autovacuum or something
similar perform other system maintenance tasks such as reindexing. I
just haven't taken it there yet. It does seem strange that your
pg_attributes table got that big; anyone have any insight here? You did
say you are using 7.4.2. I forget if that has the index-reclaiming code
in vacuum, and there are also some autovacuum bugs in the early 7.4.x
releases. You might try upgrading to either 8.0.x or a later 7.4.x
release.
Matthew O'Connor
Phil Endecott wrote:
Matthew T. O'Connor wrote:
The integrated version
of autovacuum that didn't make the cut before 8.0 avoids this problem
since the autovacuum data is stored in the database.
What is the status of this? Is it something that will be included in
8.1 or 8.0.n? I might be able to patch the current code but that
doesn't seem like a useful thing to do if a better solution will arrive
eventually. I am currently running vacuums from a cron job and I think
I will be happy with that for the time being.
I will post about integrating pg_autovacuum into the backend for 8.1 in
a few minutes.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Phil Endecott <spam_from_postgresql_general@chezphil.org> writes:
(Incidentally, I have also found that the indexes on my pg_attributes
table were taking up over half a gigabyte, which came down to less than
40 megs after reindexing them. Is there a case for having autovacuum
also call reindex?)
Lots of temp tables I suppose? If so that's not autovacuum's fault;
it wasn't getting told about the activity in pg_attribute until this
patch:
2005-03-31 18:20 tgl
* src/backend/postmaster/: pgstat.c (REL7_4_STABLE), pgstat.c
(REL8_0_STABLE), pgstat.c: Flush any remaining statistics counts
out to the collector at process exit. Without this, operations
triggered during backend exit (such as temp table deletions) won't
be counted ... which given heavy usage of temp tables can lead to
pg_autovacuum falling way behind on the need to vacuum pg_class and
pg_attribute. Per reports from Steve Crawford and others.
Unless the bloat occurred after you updated to 8.0.2, there's no issue.
regards, tom lane
Phil,
If you complete this patch, I'm very interested to see it.
I think I'm the person Matthew is talking about who inserted a sleep
value. Because of the sheer number of tables involved, even small
values of sleep caused pg_autovacuum to iterate too slowly over its
table lists to be of use in a production environment (where I still
find its behavior to be preferable to a complicated list of manual
vacuums performed in cron).
--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Strategic Open Source: Open Your i™
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005
On Jun 7, 2005, at 6:16 AM, Phil Endecott wrote:
--- "Thomas F. O'Connell" <tfo@sitening.com> wrote:
Phil,
If you complete this patch, I'm very interested to see it.
I think I'm the person Matthew is talking about who inserted a sleep
value. Because of the sheer number of tables involved, even small
values of sleep caused pg_autovacuum to iterate too slowly over its
table lists to be of use in a production environment (where I still
find its behavior to be preferable to a complicated list of manual
vacuums performed in cron).
--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Were you sleeping every time through the loop? How about something
like:
if (j % 500 == 1) usleep(100000);
Regards,
Shelby Cain
I was usleeping in tiny increments in each iteration of the loop. I
didn't try breaking it into iterative groups like this.
Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.
But I hope if anyone completes the linear improvement, they'll post
to the lists.
On Jun 10, 2005, at 9:12 AM, Shelby Cain wrote:
Hi everybody
I am trying to write a stored procedure that returns a result set, but it is
not working.
this is the function:
///
CREATE OR REPLACE FUNCTION
remisiones.fn_get_total_remitidoxprovision1("numeric")
RETURNS SETOF record AS
$BODY$
begin
select rm.provision as provision,
drm.producto as producto,
sum(drm.cantidad) as cantidad
FROM remisiones.remisiones rm, remisiones.detalles_remision drm
WHERE rm.remision = drm.remision and rm.provision = $1
GROUP BY rm.provision, drm.producto
ORDER BY rm.provision, drm.producto;
end;$BODY$
///
If I call this function from the interactive SQL of pgAdmin III, I get this
result:
select * from fn_gert_total_remitidosxprovision(1)
---------------------------------------------------------------------------
row refcursor
1 <unnamed porta1>
Is there a way to display the values of the rows returned? I need it because
I need to use it in a DataWindow definition in a PowerBuilder app.
thanks in advance
Hugo
"Thomas F. O'Connell" <tfo@sitening.com> writes:
Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.
This would be vastly easier to fix if the code were integrated into the
backend first. In the backend environment you could just keep the info
in a dynahash.c hashtable instead of in a linear list. On the client
side, you have to roll your own hashing (or adapt dynahash to life
outside the backend environment).
regards, tom lane
One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:
o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes
Who would like to get started on this? It seems pretty straight-forward.
---------------------------------------------------------------------------
Tom Lane wrote:
"Thomas F. O'Connell" <tfo@sitening.com> writes:
Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather
than O(n^2) table activity. At this point, though, I'm probably not
too likely to have much time to hack pg_autovacuum before 8.1 is
released, although if it doesn't become integrated by beta feature
freeze, I might give it a shot.
This would be vastly easier to fix if the code were integrated into the
backend first. In the backend environment you could just keep the info
in a dynahash.c hashtable instead of in a linear list. On the client
side, you have to roll your own hashing (or adapt dynahash to life
outside the backend environment).
regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:
o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes
Who would like to get started on this? It seems pretty straight-forward.
A small problem here is that until you get at least to step 3
(backend-standard error handling), none of it is going to be acceptable
to commit. So I don't entirely buy Bruce's notion of bite-size pieces
of work. You can certainly work on it in that fashion, but it's not
going into 8.1 unless most of the above is done by the end of the month.
regards, tom lane
"Bruce Momjian" <pgman@candle.pha.pa.us> writes
One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:
o move it into the backend and have it start/stop automatically
The start/stop routine is quite like Bgwriter. It requires PgStats to be
turned on. If it aborts unexpectedly, hopefully we could restart it. Shall
we have a RequestVacuum() to pass control to this process, so as to avoid
possible redundant vacuums from the user side?
o move the autovacuum configuration parameters into postgresql.conf
Some GUC parameters are correlated and must be set in order to incorporate it:
* stats_start_collector = true
* stats_row_level = true
Add a parameter to let the user pass in the configuration parameters:
* autovacuum_command = "-s 100 -S 1 ..."
So if autovacuum_command is given, should we automatically set the two
parameters above to true?
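Sketched as a postgresql.conf fragment (stats_start_collector and stats_row_level are existing 8.0 parameters; autovacuum_command is only the parameter proposed in this thread, not a real GUC):

```
# Proposed configuration sketch -- autovacuum_command is hypothetical.
stats_start_collector = true
stats_row_level = true
autovacuum_command = '-s 100 -S 1'   # would imply the two settings above
```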
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes
Change "connect/disconnect server" to "start/stop autovacuum process";
Change "execute query" to "backend APIs";
Change "list" to "hash";
We need to think more about how to handle various error conditions ...
Who would like to get started on this? It seems pretty straight-forward.
I'd like to give it a try.
Regards,
Qingqing
"Tom Lane" <tgl@sss.pgh.pa.us> writes
A small problem here is that until you get at least to step 3
(backend-standard error handling), none of it is going to be acceptable
to commit. So I don't entirely buy Bruce's notion of bite-size pieces
of work. You can certainly work on it in that fashion, but it's not
going into 8.1 unless most of the above is done by the end of the month.
Scared ...
Regards,
Qingqing
Qingqing Zhou wrote:
The start/stop routine is quite like Bgwriter. It requires PgStats to be
turned on.
Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?
-Neil
"Neil Conway" <neilc@samurai.com> writes
Wasn't the plan to rewrite pg_autovacuum to use the FSM rather than the
stats collector?
I don't understand. Currently the basic logic of pg_autovacuum is to use the
pg_stat_all_tables numbers like n_tup_upd and n_tup_del to determine if a
relation needs to be vacuumed. How would we use the FSM to get this information?
Regards,
Qingqing
On Tue, 2005-06-14 at 21:23 -0400, Bruce Momjian wrote:
One goal for 8.1 is to move /contrib/pg_autovacuum in to the backend. I
think it has to be done in four stages:
o move it into the backend and have it start/stop automatically
o move the autovacuum configuration parameters into postgresql.conf
o modify the code to use the backend API for error recovery
o modify the code to use the backend API utilities, like hashes
Who would like to get started on this? It seems pretty straight-forward.
Can autovacuum yet be configured _not_ to run vacuum during some hours
or above some load?
Even better - to stop or pause a long-running vacuum if load goes above
some limit.
If it goes into backend before the above is done, it should at least be
possible to switch it off completely.
--
Hannu Krosing <hannu@skype.net>