Subject: Completely wrong row estimates
Hello everybody,
Here is the EXPLAIN ANALYZE output for a simple query in my database
running on postgres 8.3.9:
EXPLAIN ANALYZE
SELECT * FROM word w JOIN video_words vw ON w.id = vw.word_id
WHERE w.word = 'tagtext';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=18.89..2711.16 rows=95 width=24) (actual
time=19.266..131.255 rows=43374 loops=1)
-> Index Scan using word_word_key on word w (cost=0.00..8.28
rows=1 width=12) (actual time=0.029..0.034 rows=1 loops=1)
Index Cond: ((word)::text = 'tagtext'::text)
-> Bitmap Heap Scan on video_words vw (cost=18.89..2693.31
rows=766 width=12) (actual time=19.227..77.809 rows=43374 loops=1)
Recheck Cond: (vw.word_id = w.id)
-> Bitmap Index Scan on video_words_word_id_key
(cost=0.00..18.70 rows=766 width=0) (actual time=12.662..12.662
rows=43374 loops=1)
Index Cond: (vw.word_id = w.id)
Total runtime: 154.215 ms
Note how the planner estimates that there are 766 rows in the table
that match the word 'tagtext'. In reality, 43374 do. I've tried to
get postgres to refresh the statistics by running with
enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
ANALYZE, etc., but nothing works. Postgres seems stuck with its bad
statistics and unwilling to change them. There are many other strings
that also match tens of thousands of rows in the table, which
postgres thinks only match 766.
Is this a bug in postgres?
--
Best regards, Björn
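As background for the question above, the planner's per-column statistics can be inspected directly in the pg_stats view. A minimal sketch, assuming the table and column names from the query; the query itself is mine, not from the thread:

```sql
-- Show what the planner currently believes about these columns.
-- n_distinct drives the rows-per-value estimate for join conditions
-- like vw.word_id = w.id when the value is unknown at plan time.
SELECT tablename, attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename IN ('word', 'video_words')
  AND attname IN ('word', 'word_id');
```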
On 04/05/10 01:44, Björn Lindqvist wrote:
It seems like regular maintenance tasks (VACUUM, ANALYZE, etc.)
haven't been carried out.
--
Nilesh Govindarajan
Site & Server Administrator
www.itech7.com
My India is great!
May my India be great!
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
--
mvh Björn
You probably want to run "analyze" or "vacuum analyze" to update the
statistics. Do you have autovacuum set up?
--
Rob Wultsch
wultsch@gmail.com
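One way to check Rob's question about autovacuum. A sketch; pg_stat_user_tables exists in 8.3 and records the last vacuum and analyze times:

```sql
-- Is autovacuum enabled in this cluster?
SHOW autovacuum;

-- When were these tables last vacuumed and analyzed?
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('word', 'video_words');
```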
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
Note how the planner estimates that there are 766 rows in the table
that match the word 'tagtext'. In reality, 43374 do. I've tried to
get postgres to refresh the statistics by running with
enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
ANALYZE, etc., but nothing works. Postgres seems stuck with its bad
statistics and unwilling to change them. There are many other strings
that also match tens of thousands of rows in the table, which
postgres thinks only match 766.
I assume you mean default_statistics_target, not enable_statistics_target.
You should try setting it higher - but obviously just for these
columns. Use something like
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
Then you need to run ANALYZE on it. Not vacuum, certainly not vacuum
full, just analyze.
Oh, and if what you're doing is actually full text search, which is
what it looks like, you should really look at using the native full
text indexing support rather than just stuffing your words in a table.
You'll get better and much faster results.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
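Magnus's steps, written out as runnable statements. A sketch: extending the higher statistics target to video_words.word_id, the column behind the 766-row estimate, is my addition, not part of his suggestion:

```sql
-- Raise the per-column statistics target, then refresh the statistics.
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
ALTER TABLE video_words ALTER COLUMN word_id SET STATISTICS 1000;  -- my addition
ANALYZE word;
ANALYZE video_words;
```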
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
I assume you mean default_statistics_target, not enable_statistics_target.
Yes, sorry.
You should try setting it higher - but obviously just for these
columns. Use something like
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
Then you need to run ANALYZE on it. Not vacuum, certainly not vacuum
full, just analyze.
I've done that and it doesn't help. The estimates are always off for
queries of the type I specified.
Oh, and if what you're doing is actually full text search, which is
what it looks like, you should really look at using the native full
text indexing support rather than just stuffing your words in a table.
You'll get better and much faster results.
It is more "full tag search" because I'm not using any word stemming,
phrase matching, or OR-ing of query terms. It was, when I measured it,
significantly faster than using the native text search feature.
--
Best regards, Björn
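For comparison, the native full-text setup Magnus refers to would look roughly like this. A sketch only: the video table and its tags column are hypothetical, and the 'simple' configuration is chosen because it skips stemming, matching the tag-search behaviour described above:

```sql
-- Hypothetical schema: a video table with a free-text tags column.
ALTER TABLE video ADD COLUMN tags_tsv tsvector;
UPDATE video SET tags_tsv = to_tsvector('simple', tags);
CREATE INDEX video_tags_tsv_idx ON video USING gin (tags_tsv);

-- Query: 'simple' does no stemming, so 'tagtext' matches literally.
SELECT * FROM video WHERE tags_tsv @@ to_tsquery('simple', 'tagtext');
```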
2010/4/6 Björn Lindqvist <bjourne@gmail.com>
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
Have you tried running:
'EXPLAIN ANALYZE <your query>'?
This will show you the estimates and the actuals (for each operation)
side by side.
--
Scott
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 6 April 2010 at 14:22, Scott Mead <scott.lists@enterprisedb.com> wrote:
2010/4/6 Björn Lindqvist <bjourne@gmail.com>
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
Have you tried running:
'EXPLAIN ANALYZE <your query>'?
This will show you the estimates and the actuals (for each operation)
side by side.
Yes, see my first message where I post the EXPLAIN ANALYZE output for the query.
--
Best regards, Björn