Subject: Completely wrong row estimates
Hello everybody,
Here is the EXPLAIN ANALYZE output for a simple query in my database
running on postgres 8.3.9:
EXPLAIN ANALYZE
SELECT * FROM word w JOIN video_words vw ON w.id = vw.word_id
WHERE w.word = 'tagtext';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=18.89..2711.16 rows=95 width=24) (actual
time=19.266..131.255 rows=43374 loops=1)
-> Index Scan using word_word_key on word w (cost=0.00..8.28
rows=1 width=12) (actual time=0.029..0.034 rows=1 loops=1)
Index Cond: ((word)::text = 'tagtext'::text)
-> Bitmap Heap Scan on video_words vw (cost=18.89..2693.31
rows=766 width=12) (actual time=19.227..77.809 rows=43374 loops=1)
Recheck Cond: (vw.word_id = w.id)
-> Bitmap Index Scan on video_words_word_id_key
(cost=0.00..18.70 rows=766 width=0) (actual time=12.662..12.662
rows=43374 loops=1)
Index Cond: (vw.word_id = w.id)
Total runtime: 154.215 ms
Note how the planner estimates that there are 766 rows in the table
that match the word 'tagtext'. In reality, 43374 do. I've tried to
get postgres to refresh the statistics by running with
enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
ANALYZE, etc., but nothing works. Postgres seems stuck with its bad
statistics and unwilling to change them. There are many other strings
that also match tens of thousands of rows in the table, which
postgres thinks only match 766.
Is this a bug in postgres?
--
Best regards, Björn
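As background for the question above, the planner's per-column statistics can be inspected directly in the pg_stats view. A minimal sketch, assuming the table and column names from the query; the query itself is mine, not from the thread:

```sql
-- Show what the planner currently believes about these columns.
-- n_distinct drives the rows-per-value estimate for join conditions
-- like vw.word_id = w.id when the value is unknown at plan time.
SELECT tablename, attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename IN ('word', 'video_words')
  AND attname IN ('word', 'word_id');
```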
On 04/05/10 01:44, Björn Lindqvist wrote:
It seems like regular maintenance tasks (VACUUM, ANALYZE, etc.)
haven't been carried out.
--
Nilesh Govindarajan
Site & Server Administrator
www.itech7.com
My India is great!
May my India be great!
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
--
mvh Björn
You probably want to run "analyze" or "vacuum analyze" to update the
statistics. Do you have autovacuum set up?
--
Rob Wultsch
wultsch@gmail.com
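One way to check Rob's question about autovacuum. A sketch; pg_stat_user_tables exists in 8.3 and records the last vacuum and analyze times:

```sql
-- Is autovacuum enabled in this cluster?
SHOW autovacuum;

-- When were these tables last vacuumed and analyzed?
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('word', 'video_words');
```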
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
Note how the planner estimates that there are 766 rows in the table
that match the word 'tagtext'. In reality, 43374 do. I've tried to
get postgres to refresh the statistics by running with
enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
ANALYZE, etc., but nothing works. Postgres seems stuck with its bad
statistics and unwilling to change them. There are many other strings
that also match tens of thousands of rows in the table, which
postgres thinks only match 766.
I assume you mean default_statistics_target, not enable_statistics_target.
You should try setting it higher - but obviously just for these
columns. Use something like
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
Then you need to run ANALYZE on it. Not vacuum, certainly not vacuum
full, just analyze.
Oh, and if what you're doing is actually full text search, which is
what it looks like, you should really look at using the native full
text indexing support rather than just stuffing your words in a table.
You'll get better and much faster results.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
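Magnus's steps, written out as runnable statements. A sketch: extending the higher statistics target to video_words.word_id, the column behind the 766-row estimate, is my addition, not part of his suggestion:

```sql
-- Raise the per-column statistics target, then refresh the statistics.
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
ALTER TABLE video_words ALTER COLUMN word_id SET STATISTICS 1000;  -- my addition
ANALYZE word;
ANALYZE video_words;
```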
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
I assume you mean default_statistics_target, not enable_statistics_target.
Yes, sorry.
You should try setting it higher - but obviously just for these
columns. Use something like
ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000;
Then you need to run ANALYZE on it. Not vacuum, certainly not vacuum
full, just analyze.
I've done that and it doesn't help. The estimates are always off for
queries of the type I specified.
Oh, and if what you're doing is actually full text search, which is
what it looks like, you should really look at using the native full
text indexing support rather than just stuffing your words in a table.
You'll get better and much faster results.
It is more "full tag search" because I'm not using any word stemming,
phrase matching, or OR-ing of query terms. It was, when I measured it,
significantly faster than using the native text search feature.
--
Best regards, Björn
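For comparison, the native full-text setup Magnus refers to would look roughly like this. A sketch only: the video table and its tags column are hypothetical, and the 'simple' configuration is chosen because it skips stemming, matching the tag-search behaviour described above:

```sql
-- Hypothetical schema: a video table with a free-text tags column.
ALTER TABLE video ADD COLUMN tags_tsv tsvector;
UPDATE video SET tags_tsv = to_tsvector('simple', tags);
CREATE INDEX video_tags_tsv_idx ON video USING gin (tags_tsv);

-- Query: 'simple' does no stemming, so 'tagtext' matches literally.
SELECT * FROM video WHERE tags_tsv @@ to_tsquery('simple', 'tagtext');
```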
2010/4/6 Björn Lindqvist <bjourne@gmail.com>
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
Have you tried running:
'EXPLAIN ANALYZE <your query>'?
This will show you the estimates and the actuals (for each operation)
side by side.
--
Scott
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 6 April 2010 at 14:22, Scott Mead <scott.lists@enterprisedb.com> wrote:
2010/4/6 Björn Lindqvist <bjourne@gmail.com>
On 5 April 2010 at 11:57, Magnus Hagander <magnus@hagander.net> wrote:
Have you tried running:
'EXPLAIN ANALYZE <your query>'?
This will show you the estimates and the actuals (for each operation)
side by side.
Yes, see my first message where I post the EXPLAIN ANALYZE output for the query.
--
Best regards, Björn