updates (PostgreSQL) very slow
Hi,
Can someone please help me? My PostgreSQL queries
are very slow, especially UPDATE statements. What
can I do to improve the speed? I have already tried
VACUUM and ANALYZE. From the command line I ran
vacuumdb -z -a -f
and from inside psql:
database=# VACUUM FULL ANALYZE;
I work on a Linux (Mandrake 9.1) computer with 2 Xeon
processors and 4 GB of RAM. The hard drive on which
PostgreSQL is running is an 80 GB drive that reads at
55 Mbit/sec. The next query is an example. The table
I am working with here contains 747 524 records and 14
fields (columns).
bons_acc=# explain update edc_ww set edc=null;
                         QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on edc_ww  (cost=0.00..156793.91 rows=3491 width=184)
(1 row)
Time: 0.61 ms
bons_acc=# update edc_ww set edc=null;
UPDATE 747524
Time: 7628686.23 ms
This is just a Seq Scan where a numeric field must be updated to
NULL, but if I run it you can see that this "simple" query takes
forever (7628686.23 ms, which is over 2 hours just to update
747524 records!). I don't think that the table is too big. Could it
be my hardware/software/PostgreSQL? What can I do to
optimise PostgreSQL? I have already increased the shared buffers in
the conf file as well.
Bobbie
__________________________________________________________
Bobbie van der Westhuizen
Quantitative Animal Breeding (BLUP)
ARC - Animal Improvement Institute
+27-12-672-9128 (o/h)
+27-12-665-1419 (fax)
bobbie@irene.agric.za
__________________________________________________________
> Can someone please help me? My PostgreSQL queries
> are very slow, especially UPDATE statements. What
> can I do to improve the speed? I have already tried
> VACUUM and ANALYZE. From the command line I ran
> vacuumdb -z -a -f
> and from inside psql:
> database=# VACUUM FULL ANALYZE;
Good start for taking performance measurements - once the planner's statistics
are updated you will be able to get accurate analyses.
> I work on a Linux (Mandrake 9.1) computer with 2 Xeon
> processors and 4 GB of RAM. The hard drive on which
> PostgreSQL is running is an 80 GB drive that reads at
> 55 Mbit/sec. The next query is an example. The table
> I am working with here contains 747 524 records and 14
> fields (columns).
So you have some decent-sized hardware here, but you may want a second
drive for backups or for housing the logfiles. Also, your database is not
exceptionally large - PostgreSQL can handle much larger.
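If you do add a second drive, one common approach for the logfiles (just a
sketch - it assumes the data directory is /var/lib/pgsql/data and the second
drive is mounted at /mnt/disk2) is to move the WAL directory there and symlink
it back while the postmaster is stopped:
$ pg_ctl stop -D /var/lib/pgsql/data        # stop the postmaster first
$ mv /var/lib/pgsql/data/pg_xlog /mnt/disk2/pg_xlog
$ ln -s /mnt/disk2/pg_xlog /var/lib/pgsql/data/pg_xlog
$ pg_ctl start -D /var/lib/pgsql/data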
> bons_acc=# explain update edc_ww set edc=null;
>                          QUERY PLAN
> -----------------------------------------------------------------
>  Seq Scan on edc_ww  (cost=0.00..156793.91 rows=3491 width=184)
> (1 row)
> Time: 0.61 ms
> bons_acc=# update edc_ww set edc=null;
> UPDATE 747524
> Time: 7628686.23 ms
> This is just a Seq Scan where a numeric field must be updated to
> NULL, but if I run it you can see that this "simple" query takes
> forever (7628686.23 ms, which is over 2 hours just to update
> 747524 records!). I don't think that the table is too big. Could it
> be my hardware/software/PostgreSQL? What can I do to
> optimise PostgreSQL? I have already increased the shared buffers in
> the conf file as well.
For improving the performance of large updates such as this query you will
need to adjust some other parameters in postgresql.conf, such as increasing
the number of checkpoint segments and setting logging to a minimal level.
Or, to accomplish this particular update quickly, you can drop the edc
column and re-add it, which leaves it null for every existing row.
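Roughly, something along these lines (the checkpoint_segments value is only an
illustration, not a tuned recommendation, and the numeric type is assumed from
the description of edc above):
# in postgresql.conf, then reload or restart the postmaster
checkpoint_segments = 16

-- instead of updating every row, drop and re-add the column;
-- a newly added column with no default is null for all existing rows
bons_acc=# BEGIN;
bons_acc=# ALTER TABLE edc_ww DROP COLUMN edc;
bons_acc=# ALTER TABLE edc_ww ADD COLUMN edc numeric;
bons_acc=# COMMIT;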
However, updating every row to null in a 700k-row table is going to take a
while, and this query is probably not a good test case for judging your
database performance. Try testing some of your other queries. Post the EXPLAIN
ANALYZE results of those queries to the pgsql-performance@postgresql.org
list along with a complete copy of your postgresql.conf file. I think you
are not getting a good measurement of your actual database performance by
judging it with this simple test case where every row is updated.
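For example (the query below is only a placeholder - substitute one of the
queries you actually run):
bons_acc=# EXPLAIN ANALYZE SELECT * FROM edc_ww WHERE edc IS NULL;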
Regards,
Fred
On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
> "Fred Moyer" <fred@redhotpenguin.com> writes:
> > This is just a Seq Scan where a numeric field must be updated to
> > NULL, but if I run it you can see that this "simple" query takes
> > forever (7628686.23 ms, which is over 2 hours just to update
> > 747524 records!).
> > However updating every row to null with 700k rows is going to take a while.
> A while, sure, but 2 hours seems excessive to me too. I'm betting that
> there are triggers or foreign keys on the table being updated, and that
> that's where the time is going. It might be possible to improve that,
> but Bobbie hasn't given us enough information.
If there are no foreign keys or triggers, and updating each row takes
one drive seek (approximately 9 ms with the 80 GB IDE drive being used
here), then 747524 seeks will take 6727716 ms, about 10% less than
the 7628686 ms the update above took. Is this an accurate
estimate, or are these numbers just coincidence? It seems like this could
represent the least efficient update scenario.
"Fred Moyer" <fred@redhotpenguin.com> writes:
This is just a Seq Scan where a numeric field must be updated to
NULL but if I run it you can see that this �simple� query takes
forever (7628686.23 ms this is over 2 hours for only updating
747524 records!).
However updating every row to null with 700k rows is going to take a while
A while, sure, but 2 hours seems excessive to me too. I'm betting that
there are triggers or foreign keys on the table being updated, and that
that's where the time is going. It might be possible to improve that,
but Bobbie hasn't given us enough information.
Another thing that jumps out at me is that this table hasn't been
vacuumed or analyzed recently. The planner thinks there are 3491 rows
when really there are 747524. That's a bit of a big difference. It
won't matter for the UPDATE itself --- a seqscan is a seqscan --- but
it might matter for planning foreign-key queries.
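A quick way to see the estimate the planner is working from, and to refresh
it, is something along these lines (reltuples and relpages are only updated by
VACUUM/ANALYZE):
bons_acc=# SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'edc_ww';
bons_acc=# ANALYZE edc_ww;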
regards, tom lane
Have you run this update query again and again without vacuuming in between?
http://www.postgresql.org/docs/7.4/static/maintenance.html#ROUTINE-VACUUMING
If so, you might have millions and millions of dead tuples taking up
space and slowing things down. If you're running 7.4, install the
autovacuum daemon and turn it on. It's a nice little program that should
mostly take care of this issue for you.
Got any foreign keys on that field? Triggers?
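An easy way to check is psql's \d command, which lists the table's indexes,
foreign-key constraints, and triggers:
bons_acc=# \d edc_ww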
Fred Moyer <fred@redhotpenguin.com> writes:
> On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
> > A while, sure, but 2 hours seems excessive to me too.
> If there are no foreign keys or triggers, and updating each row takes
> one drive seek (approximately 9 ms with the 80 GB IDE drive being used
> here), then 747524 seeks will take 6727716 ms, about 10% less than
> the 7628686 ms the update above took. Is this an accurate
> estimate, or are these numbers just coincidence?
Probably coincidence. There's no reason to think that a large UPDATE
would expend one disk seek per updated row on average --- there's enough
buffering between the UPDATE and the drive heads that under normal
circumstances the cost should be lots less.
If I had to bet at this point I'd bet on inefficient foreign-key checks,
but since we haven't seen any schema details that's purely speculation.
regards, tom lane
On 11 Mar 2004 at 2:01, Tom Lane wrote:
> Fred Moyer <fred@redhotpenguin.com> writes:
> > On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
> > > A while, sure, but 2 hours seems excessive to me too.
> > If there are no foreign keys or triggers, and updating each row takes
> > one drive seek (approximately 9 ms with the 80 GB IDE drive being used
> > here), then 747524 seeks will take 6727716 ms, about 10% less than
> > the 7628686 ms the update above took. Is this an accurate
> > estimate, or are these numbers just coincidence?
> Probably coincidence. There's no reason to think that a large UPDATE
> would expend one disk seek per updated row on average --- there's enough
> buffering between the UPDATE and the drive heads that under normal
> circumstances the cost should be lots less.
> If I had to bet at this point I'd bet on inefficient foreign-key checks,
> but since we haven't seen any schema details that's purely speculation.
> regards, tom lane
There are no foreign keys in this table. What schema details do you
need? Then I can give them to you. I am a new user of PostgreSQL so I am
not clued up with all of this stuff yet.
__________________________________________________________
Bobbie van der Westhuizen
Quantitative Animal Breeding (BLUP)
ARC - Animal Improvement Institute
+27-12-672-9128 (o/h)
+27-12-665-1419 (fax)
bobbie@irene.agric.za
__________________________________________________________
Hello,
Is this a place where increasing default_statistics_target will help?
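It can be raised globally in postgresql.conf or per column - a sketch, with
100 chosen only as an example value, not a recommendation:
# in postgresql.conf
default_statistics_target = 100

-- or per column, followed by a fresh ANALYZE
bons_acc=# ALTER TABLE edc_ww ALTER COLUMN edc SET STATISTICS 100;
bons_acc=# ANALYZE edc_ww;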
Sincerely,
J
"Bobbie van der Westhuizen" <Bobbie@idpi1.agric.za> writes:
If I had to bet at this point I'd bet on inefficient foreign-key checks,
but since we haven't seen any schema details that's purely
speculation.
There are no foreign-keys in this table. What schema details do you
need, then I can give it to you. I am a new user of postgreSQL so I am
not clude-up with all of the stuff.
"pg_dump -s" is the easiest way of exhibiting the schema of a database.
regards, tom lane