Nested loop in simple query taking long time
Hello list,
Usually I can see what is wrong with my queries, but I can't figure out
why this one is slow.
Below are the query and the EXPLAIN ANALYZE output.
Any help would be appreciated.
EXPLAIN ANALYZE SELECT computer_name
FROM tbl_computer
INNER JOIN tbl_share ON pk_computer_id = tbl_share.fk_computer_id
INNER JOIN tbl_archive ON pk_share_id = tbl_archive.fk_share_id
LEFT OUTER JOIN tbl_job ON fk_job_id = pk_job_id
LEFT OUTER JOIN tbl_job_group ON fk_job_group_id = pk_job_group_id
WHERE archive_complete IS TRUE AND (job_group_type != 'R' OR
job_group_type IS NULL)
GROUP BY computer_name ORDER BY computer_name;
Group  (cost=1171.27..19099.89 rows=16 width=11) (actual time=4184.019..40929.159 rows=14 loops=1)
  ->  Nested Loop Left Join  (cost=1171.27..19099.61 rows=112 width=11) (actual time=4184.018..40928.994 rows=192 loops=1)
        Join Filter: (tbl_archive.fk_job_id = tbl_job.pk_job_id)
        Filter: ((tbl_job_group.job_group_type <> 'R'::bpchar) OR (tbl_job_group.job_group_type IS NULL))
        ->  Nested Loop  (cost=1151.65..18960.06 rows=123 width=19) (actual time=4080.070..40821.217 rows=192 loops=1)
              ->  Nested Loop  (cost=1.18..43.19 rows=16 width=19) (actual time=59.167..91.739 rows=16 loops=1)
                    Join Filter: (tbl_computer.pk_computer_id = tbl_share.fk_computer_id)
                    ->  Index Scan using computer_name on tbl_computer  (cost=0.00..36.25 rows=16 width=19) (actual time=48.844..81.210 rows=16 loops=1)
                    ->  Materialize  (cost=1.18..1.34 rows=16 width=16) (actual time=0.645..0.650 rows=16 loops=16)
                          ->  Seq Scan on tbl_share  (cost=0.00..1.16 rows=16 width=16) (actual time=10.312..10.318 rows=16 loops=1)
              ->  Bitmap Heap Scan on tbl_archive  (cost=1150.47..1182.20 rows=8 width=16) (actual time=1564.082..2545.570 rows=12 loops=16)
                    Recheck Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
                    Filter: (archive_complete IS TRUE)
                    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
                          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
        ->  Materialize  (cost=19.62..20.01 rows=39 width=13) (actual time=0.385..0.549 rows=39 loops=192)
              ->  Hash Left Join  (cost=11.65..19.58 rows=39 width=13) (actual time=73.747..103.889 rows=39 loops=1)
                    Hash Cond: (tbl_job.fk_job_group_id = tbl_job_group.pk_job_group_id)
                    ->  Seq Scan on tbl_job  (cost=0.00..7.39 rows=39 width=16) (actual time=54.284..84.396 rows=39 loops=1)
                    ->  Hash  (cost=11.29..11.29 rows=29 width=13) (actual time=0.119..0.119 rows=29 loops=1)
                          ->  Seq Scan on tbl_job_group  (cost=0.00..11.29 rows=29 width=13) (actual time=0.013..0.087 rows=29 loops=1)
Total runtime: 40929.241 ms
Cheers,
Henke
Henrik Zagerholm <henke@mac.se> writes:
Usually I can see what is wrong with queries but I can't figure out
why this query is slow.
Seems the main problem is here:

    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)

Why is this scan finding so many more rows than the planner expects?
Perhaps increasing the stats targets on these columns would help it
make a better estimate.
regards, tom lane
On 5 Dec 2007, at 16:25, Tom Lane wrote:
Henrik Zagerholm <henke@mac.se> writes:
Usually I can see what is wrong with queries but I can't figure out
why this query is slow.

Seems the main problem is here:

    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)

Why is this scan finding so many more rows than the planner expects?
Perhaps increasing the stats targets on these columns would help it
make a better estimate.
This is really weird. That table's primary key sequence is at 1220
and the number of rows right now is 139. There have never been that
many rows in tbl_archive. Could the index or the stats be really
corrupt?
Thanks,
Henke
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
Henrik Zagerholm <henke@mac.se> writes:
On 5 Dec 2007, at 16:25, Tom Lane wrote:
Henrik Zagerholm <henke@mac.se> writes:
    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)

Why is this scan finding so many more rows than the planner expects?

This is really weird. That table's primary key sequence is at 1220
and the number of rows right now is 139. There have never been that
many rows in tbl_archive. Could the index or the stats be really
corrupt?
I wonder how long it's been since you vacuumed that table? The rowcount
from the bitmap indexscan would include tuple IDs that are in the index
but prove to be dead upon arrival at the heap.
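[Editor's note: one way to check for the dead tuples Tom describes. A sketch; VACUUM VERBOSE reports removable and nonremovable row versions per table and index:]

```sql
-- Reports how many dead row versions the table and its indexes carry.
VACUUM VERBOSE tbl_archive;

-- The planner's view of table and index size, for comparison:
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('tbl_archive', 'tbl_archive_idx1');
```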
regards, tom lane
On 6 Dec 2007, at 18:12, Tom Lane wrote:
Henrik Zagerholm <henke@mac.se> writes:
On 5 Dec 2007, at 16:25, Tom Lane wrote:
Henrik Zagerholm <henke@mac.se> writes:
    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)

Why is this scan finding so many more rows than the planner expects?

This is really weird. That table's primary key sequence is at 1220
and the number of rows right now is 139. There have never been that
many rows in tbl_archive. Could the index or the stats be really
corrupt?

I wonder how long it's been since you vacuumed that table? The rowcount
from the bitmap indexscan would include tuple IDs that are in the index
but prove to be dead upon arrival at the heap.
I actually have autovacuum on for the whole database, but I may have
been a little too restrictive with the autovacuum settings.
I did a VACUUM FULL and a REINDEX and now it's fine again.
I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each update creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need routine reindexing on this table. Thank god it's
not big. :)
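[Editor's note: the maintenance described above, plus a size check, for reference. A sketch; pg_total_relation_size and pg_size_pretty are available from 8.1 on:]

```sql
-- Rewrite the table to reclaim dead space, then rebuild its indexes.
VACUUM FULL tbl_archive;
REINDEX TABLE tbl_archive;

-- Table size including indexes, to compare before and after:
SELECT pg_size_pretty(pg_total_relation_size('tbl_archive'));
```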
Thanks Tom!
Cheers,
Henke
regards, tom lane
Henrik wrote:
I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each update creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need routine reindexing on this table. Thank god it's
not big. :)
I suggest you put a lone VACUUM on that table in cron, say once every 5
minutes, and you should be fine. You shouldn't need a reindex at all.
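[Editor's note: Alvaro's suggestion as a cron entry. A sketch; the database name mydb is a placeholder:]

```
# Run a plain VACUUM on the bloated table every 5 minutes.
# Plain VACUUM (unlike VACUUM FULL) does not lock out reads or writes.
*/5 * * * *  psql -d mydb -c 'VACUUM tbl_archive'
```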
--
Alvaro Herrera http://www.PlanetPostgreSQL.org/
"Right now the sectors on the hard disk run clockwise, but I heard a rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended." (Gerry Pourwelle)
Future Enhancement?
If the column's new value can fit in the space already being used by the
existing value, just change the column value in place and leave the
record alone. Would reduce the need for vacuum in many cases.
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alvaro Herrera
Sent: Thursday, December 06, 2007 2:26 PM
To: Henrik
Cc: Tom Lane; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Nested loop in simple query taking long time
Henrik wrote:
I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each update creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need routine reindexing on this table. Thank god it's
not big. :)
I suggest you put a lone VACUUM on that table in cron, say once every 5
minutes, and you should be fine. You shouldn't need a reindex at all.
--
Alvaro Herrera
http://www.PlanetPostgreSQL.org/
"Right now the sectors on the hard disk run clockwise, but I heard a
rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended." (Gerry Pourwelle)
Gauthier, Dave wrote:
Future Enhancement?
If the column's new value can fit in the space already being used by the
existing value, just change the column value in place and leave the
record alone. Would reduce the need for vacuum in many cases.
That's in 8.3. Not exactly like that (because it doesn't work in
Postgres due to MVCC), but something close. Look for HOT in the release
notes.
--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.
On 6 Dec 2007, at 22:18, Alvaro Herrera wrote:
Gauthier, Dave wrote:
Future Enhancement?
If the column's new value can fit in the space already being used by the
existing value, just change the column value in place and leave the
record alone. Would reduce the need for vacuum in many cases.

That's in 8.3. Not exactly like that (because it doesn't work in
Postgres due to MVCC), but something close. Look for HOT in the release
notes.
I want 8.3 now! =)
I did a test last night. The table grew from 16 kB to 648 kB. The index
grew from 64 kB to 6056 MB.
A REINDEX took the index back down to 64 kB. The good thing is that
REINDEX takes about 1 sec, so I can do that quite often. :)
Thanks for all the input
Henke
--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.
On 6 Dec 2007, at 20:26, Alvaro Herrera wrote:
Henrik wrote:
I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each update creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need routine reindexing on this table. Thank god it's
not big. :)

I suggest you put a lone VACUUM on that table in cron, say once every 5
minutes, and you should be fine. You shouldn't need a reindex at all.
Instead of cron, can't I just have really aggressive autovacuum
settings on this table?
Thanks,
Henke
--
Alvaro Herrera http://www.PlanetPostgreSQL.org/
"Right now the sectors on the hard disk run clockwise, but I heard a
rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended."  (Gerry Pourwelle)
Henrik wrote:
On 6 Dec 2007, at 20:26, Alvaro Herrera wrote:
Henrik wrote:
I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each update creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need routine reindexing on this table. Thank god it's
not big. :)

I suggest you put a lone VACUUM on that table in cron, say once every 5
minutes, and you should be fine. You shouldn't need a reindex at all.

Instead of cron, can't I just have really aggressive autovacuum
settings on this table?
Not on 8.2 (or earlier), because it can only process one table at a
time, so if it ever takes much longer than 5 minutes for vacuuming other
tables, this table will be neglected for at least that long -- and
probably a lot longer actually.
On 8.3 this should be somewhat less of a problem because autovacuum can
process more than one table at a time. It will still be a problem if
there are several tables that take much longer than 5 minutes, but it
should be much less severe.
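[Editor's note: for what it's worth, 8.2 does allow per-table autovacuum thresholds via the pg_autovacuum system catalog, though the single-worker limitation Alvaro describes still applies. A sketch; the threshold values below are illustrative, and -1 means "use the global cost settings":]

```sql
-- Make autovacuum consider tbl_archive after ~200 updated/deleted rows,
-- rather than waiting for the default scale-factor-based threshold.
INSERT INTO pg_autovacuum
       (vacrelid, enabled, vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor, vac_cost_delay, vac_cost_limit)
VALUES ('tbl_archive'::regclass, true, 200, 0.0, 100, 0.0, -1, -1);
```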
--
Alvaro Herrera http://www.flickr.com/photos/alvherre/
"XML!" Exclaimed C++. "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.