VACUUM and ANALYZE With Empty Tables

Started by Mark Dexter · over 21 years ago · 4 messages · general
#1 Mark Dexter
MDEXTER@dexterchaney.com

We use a development environment that works with Postgres via ODBC and
uses cursors to insert and update rows in Postgres tables. I'm using
Postgres version 7.4.5.

I have a test program that reads 34,000 rows from an external file and
inserts them into a Postgres table. Under normal circumstances, it
takes about 1 minute to complete the test. In troubleshooting a
performance problem, I have discovered the following:

A. If I TRUNCATE or DELETE all of the rows in the table and then run
VACUUM or ANALYZE on the empty table, the test program takes over 15
minutes to complete (i.e., 15X performance drop).
B. If I drop and create the table without using VACUUM or ANALYZE, it
takes about 1 minute to complete the test program.
C. If I insert 94 or more rows into the table and then run VACUUM or
ANALYZE, it takes 1 minute to complete the test program.
D. If I insert 93 or fewer rows into the table and then run VACUUM or
ANALYZE, it takes over 15 minutes to complete the test.
E. If the test program is running slowly, I can speed it up to normal
speed by running ANALYZE <table> from another database session.

My concern about this is as follows. Our application uses a number of
"work" tables that will have many rows inserted into them during a
process (e.g., a user running a report) and then the rows will be
deleted once the process is over. (I don't think we can use TEMPORARY
tables because the tables need to be seen outside of the current
database session.)

If we routinely run VACUUM or VACUUM ANALYZE (e.g., nightly), these work
tables will normally be empty when the VACUUM is run. So it would
appear from the testing above that they will experience performance
problems when inserting large numbers of rows through our application.

Is there some easy way around this problem? Is there a way to force
VACUUM or ANALYZE to optimize for a set number of rows even if the table
is empty when it is run? Thanks for your help. Mark
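
[The effect described above can be seen directly in the planner statistics. A minimal sketch, with a hypothetical table name:

```sql
-- Hypothetical work table, illustrating the scenario described above.
CREATE TABLE work_table (id integer PRIMARY KEY, payload text);

TRUNCATE work_table;
ANALYZE work_table;

-- After ANALYZE on the empty table, the stored statistics report zero
-- rows, and the planner will plan accordingly even as the table fills
-- up again:
SELECT reltuples, relpages FROM pg_class WHERE relname = 'work_table';
```

Until the table is analyzed again with rows in it, plans are built against those zero-row statistics.]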

#2 Richard Huxton
dev@archonet.com
In reply to: Mark Dexter (#1)
Re: VACUUM and ANALYZE With Empty Tables

Mark Dexter wrote:

We use a development environment that works with Postgres via ODBC and
uses cursors to insert and update rows in Postgres tables. I'm using
Postgres version 7.4.5.

A. If I TRUNCATE or DELETE all of the rows in the table and then run
VACUUM or ANALYZE on the empty table, the test program takes over 15
minutes to complete (i.e., 15X performance drop).

If we routinely run VACUUM or VACUUM ANALYZE (e.g., nightly), these work
tables will normally be empty when the VACUUM is run. So it would
appear from the testing above that they will experience performance
problems when inserting large numbers of rows through our application.

Yep - it's a known issue. The analyse is doing what you asked, it's just
not what you want.

Is there some easy way around this problem? If there a way to force
VACUUM or ANALYZE to optimize for a set number of rows even if the table
is empty when it is run? Thanks for your help. Mark

There are only two options I know of:
1. Vacuum analyse each table separately (tedious, I know)
2. Try pg_autovacuum in the contrib/ directory

The autovacuum utility monitors activity for you and targets tables when
they've seen a certain amount of activity. Even if it hasn't got the
tunability you need, it should be a simple patch to add a list of
"excluded" tables.
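
[As an illustration of option 1, a sketch with made-up table names; the point is to run it while the work tables are populated, rather than from the nightly job:

```sql
-- Analyze each work table while it still contains a representative
-- number of rows, so the stored statistics reflect normal usage.
VACUUM ANALYZE work_table_a;
VACUUM ANALYZE work_table_b;
```

]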

--
Richard Huxton
Archonet Ltd

#3 Mark Dexter
MDEXTER@dexterchaney.com
In reply to: Richard Huxton (#2)
Re: VACUUM and ANALYZE With Empty Tables

Thanks very much for the information. It would appear that our best option might be to vacuum analyze these tables in our application at a point in time when they contain rows, instead of doing it at night. Needless to say, it would be nice to have an option to analyze with a target number of rows instead of the number presently in the table.

I suppose another option would be to keep a small number of rows permanently in these tables. In my testing, 100 rows (94 to be exact) did the trick. Is this number going to vary from table to table?

Thanks again for your help. Mark


#4 Ragnar Hafstað
gnari@simnet.is
In reply to: Mark Dexter (#3)
Re: VACUUM and ANALYZE With Empty Tables

From: "Mark Dexter" <MDEXTER@dexterchaney.com>

Thanks very much for the information. It would appear that our best
option might be to vacuum analyze these tables in our application at a
point in time when they contain rows instead of doing it at night.
Needless to say, it would be nice to have an option to analyze with a
target number of rows instead of the number presently in the table.

I suppose another option would be to keep a small number of rows
permanently in these tables. In my testing, 100 rows (94 to be exact)
did the trick. Is this number going to vary from table to table?

or, you could add this procedure to your nightly vacuum job
(after the regular vacuum analyzes):
1. insert a representative dummy row set into the empty table
2. analyze the table
3. remove the rows again

this way the dummy rows won't interfere with your regular operations.
if the table is not always empty at vacuum time, you will of course
need to be able to differentiate the dummy rows from the regular ones,
so that you remove only the dummy ones.
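
a rough sketch of that procedure (the table, column, and seed-table
names are made up):

```sql
-- After the regular nightly VACUUM ANALYZE:

-- 1. insert a representative set of dummy rows, flagged so they can be
--    told apart from any real rows present at vacuum time
INSERT INTO work_table (id, payload, is_dummy)
SELECT id, payload, true FROM work_table_seed;

-- 2. analyze while the dummy rows are in place
ANALYZE work_table;

-- 3. remove only the dummy rows
DELETE FROM work_table WHERE is_dummy;
```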

gnari