7.4.7: strange planner decision

Started by Roman Neuhauserover 20 years ago11 messagesgeneral
Jump to latest
#1Roman Neuhauser
neuhauser@sigpipe.cz

Why does the planner want to crawl the table that has 5M rows instead of the one
with 176k rows? Both tables are freshly vacuum-full-analyzed.

7.4.7 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)

callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44)
(5 rows)

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
"ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

callrec32=# \d fix.dups
Table "fix.dups"
Column | Type | Modifiers
--------+------------------------+-----------
base | character varying(255) |
Indexes:
"dups_base_key" unique, btree (base)

callrec32=# select count(*) from fix.files; select count(*) from fix.dups;
count
---------
5278458
(1 row)

count
--------
176160
(1 row)

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

#2Richard Huxton
dev@archonet.com
In reply to: Roman Neuhauser (#1)
Re: 7.4.7: strange planner decision

Roman Neuhauser wrote:

Why does the planner want to crawl the table that has 5M rows instead of the one
with 176k rows? Both tables are freshly vacuum-full-analyzed.

Because you don't have an index on "base" for the files table.

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
"ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

A couple of indexes, but none simple on "base", so it can't be used for
the join.

--
Richard Huxton
Archonet Ltd

#3David Pratt
fairwinds@eastlink.ca
In reply to: Roman Neuhauser (#1)
Array as parameter for plpgsql function

How does one pass an array as a parameter to a plpgsql function? I
have tried this the following. I can't seem to get a select statement
to work without syntax problems and no examples in Postgres book to
help with this :( This is just a test so please ignore the fact it is
a simple function.

CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '

DECLARE

test_array ALIAS FOR $1; -- alias for input array

BEGIN

return array_upper(test_array,1)

END;
' LANGUAGE 'plpgsql';

SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
two']]) AS output;

#4Roman Neuhauser
neuhauser@sigpipe.cz
In reply to: Richard Huxton (#2)
Re: 7.4.7: strange planner decision

# dev@archonet.com / 2005-07-13 12:57:31 +0100:

Roman Neuhauser wrote:

Why does the planner want to crawl the table that has 5M rows instead of
the one
with 176k rows? Both tables are freshly vacuum-full-analyzed.

Because you don't have an index on "base" for the files table.

I added one, ran vacuum full analyze fix.files, and:

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
"ff_baseonly_idx" btree (base)
"ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44)
(5 rows)

Which is exactly what I expected. Using left prefix of a multicolumn
index normally works just fine, thank you.

http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html:

The query planner can use a multicolumn index for queries that involve
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the leftmost column in the index definition plus any number of columns
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
listed to the right of it, without a gap. For example, an index on (a,
b, c) can be used in queries involving all of a, b, and c, or in queries
involving both a and b, or in queries involving only a

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

#5Richard Huxton
dev@archonet.com
In reply to: Roman Neuhauser (#4)
Re: 7.4.7: strange planner decision

Roman Neuhauser wrote:

Because you don't have an index on "base" for the files table.

I added one, ran vacuum full analyze fix.files, and:

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
"ff_baseonly_idx" btree (base)
"ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44)
(5 rows)

Which is exactly what I expected. Using left prefix of a multicolumn
index normally works just fine, thank you.

Couldn't figure out what you meant here - had to go back and re-read
your index definitions. Sorry - missed the (base, ...) on the front of
base_storename_idx.

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

--
Richard Huxton
Archonet Ltd

#6Roman Neuhauser
neuhauser@sigpipe.cz
In reply to: Richard Huxton (#5)
Re: 7.4.7: strange planner decision

# dev@archonet.com / 2005-07-13 14:09:34 +0100:

Roman Neuhauser wrote:

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
(base)::text)))
"ff_baseonly_idx" btree (base)
"ff_storename_idx" btree (((((dir)::text || '/'::text) ||
(base)::text)))

callrec32=# explain select fd.base from fix.dups fd join fix.files ff
using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458
width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160
width=44)
(5 rows)

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

callrec32=# SET enable_seqscan=false;
SET
callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
--------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..1066990.93 rows=176161 width=44)
-> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44)
-> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41)
Index Cond: (("outer".base)::text = (ff.base)::text)
(4 rows)

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Pratt (#3)
Re: Array as parameter for plpgsql function

David Pratt <fairwinds@eastlink.ca> writes:

CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '
DECLARE
test_array ALIAS FOR $1; -- alias for input array
BEGIN
return array_upper(test_array,1)
END;
' LANGUAGE 'plpgsql';

SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
two']]) AS output;

Works fine for me, once I add the semicolon you forgot:

return array_upper(test_array,1);

regards, tom lane

#8Richard Huxton
dev@archonet.com
In reply to: Roman Neuhauser (#6)
Re: 7.4.7: strange planner decision

Roman Neuhauser wrote:

# dev@archonet.com / 2005-07-13 14:09:34 +0100:

Roman Neuhauser wrote:

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
(base)::text)))
"ff_baseonly_idx" btree (base)
"ff_storename_idx" btree (((((dir)::text || '/'::text) ||
(base)::text)))

callrec32=# explain select fd.base from fix.dups fd join fix.files ff
using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458
width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160
width=44)
(5 rows)

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

callrec32=# SET enable_seqscan=false;
SET
callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
--------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..1066990.93 rows=176161 width=44)
-> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44)
-> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41)
Index Cond: (("outer".base)::text = (ff.base)::text)

OK - so it thinks the cost of this plan will be about 1 million, whereas
the old plan was 290 thousand. The question is - why?
What are your planner settings? Ch 16.4.4.2 here
http://www.postgresql.org/docs/7.4/static/runtime-config.html#RUNTIME-CONFIG-QUERY

I'm guessing something to do with cpu_index_tuple_cost or random_page_cost.
--
Richard Huxton
Archonet Ltd

#9Roman Neuhauser
neuhauser@sigpipe.cz
In reply to: Roman Neuhauser (#6)
Re: 7.4.7: strange planner decision

# neuhauser@sigpipe.cz / 2005-07-13 15:58:09 +0200:

# dev@archonet.com / 2005-07-13 14:09:34 +0100:

Roman Neuhauser wrote:

callrec32=# \d fix.files
Table "fix.files"
Column | Type | Modifiers
--------+------------------------+-----------
dir | character varying(255) |
base | character varying(255) |
Indexes:
"base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
(base)::text)))
"ff_baseonly_idx" btree (base)
"ff_storename_idx" btree (((((dir)::text || '/'::text) ||
(base)::text)))

callrec32=# explain select fd.base from fix.dups fd join fix.files ff
using (base);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=5340.00..292675.06 rows=176161 width=44)
Hash Cond: (("outer".base)::text = ("inner".base)::text)
-> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458
width=41)
-> Hash (cost=3436.60..3436.60 rows=176160 width=44)
-> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160
width=44)
(5 rows)

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

callrec32=# SET enable_seqscan=false;
SET
callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
QUERY PLAN
--------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..1066990.93 rows=176161 width=44)
-> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44)
-> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41)
Index Cond: (("outer".base)::text = (ff.base)::text)
(4 rows)

BTW, this query or its equivalent is what I'm really after:

callrec32=# explain select c1.storename from fix.dups fd join calls
c1 on (fd.base = basename(c1.storename) and c1.iscouple = '1') where
not exists (select 1 from fix.files ff where c1.storename = ff.dir
|| '/' || ff.base);
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Hash Join (cost=7474.26..23127970.91 rows=2354719 width=60)
Hash Cond: ((basename("outer".storename))::text = ("inner".base)::text)
-> Index Scan using calls2_iscouple_idx on calls c1 (cost=0.00..22982439.69 rows=2354719 width=60)
Filter: ((iscouple = 1::smallint) AND (NOT (subplan)))
SubPlan
-> Index Scan using ff_storename_idx on files ff (cost=0.00..88570.16 rows=26393 width=0)
Index Cond: (($0)::text = (((dir)::text || '/'::text) || (base)::text))
-> Hash (cost=5570.86..5570.86 rows=176160 width=44)
-> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44)
(9 rows)

calls has (among others):

"calls2_basename_storename_idx" btree (basename(storename), storename) WHERE (iscouple = (1)::smallint)
"calls2_storename_idx" btree (storename) WHERE (iscouple = (1)::smallint)
"calls2_iscouple_idx" btree (id) WHERE (iscouple = (1)::smallint)

WHy does it use the calls2_iscouple_idx index when calls.id isn't
used anywhere in the query? I would guess that calls2_storename_idx
would actually be more useful.

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Huxton (#5)
Re: 7.4.7: strange planner decision

Richard Huxton <dev@archonet.com> writes:

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

I think you'll find that the results suck ;-). It looks to me that the
planner is making exactly the right choice here. The only plausible
alternative is a nestloop with inner indexscan on "files", which would
imply 176160 separate index probes into "files", which is unlikely to
win compared to one seqscan. (In the aggregate the index probes are
likely to end up touching every page of "files" anyway --- you would
need a much larger files table before this stopped being true.)

If you want to compare the nestloop plan to test this theory, turning
off enable_hashjoin and (if necessary) enable_mergejoin would be the
better way to get it. But let's see EXPLAIN ANALYZE results for both
cases, not just EXPLAIN.

regards, tom lane

#11David Pratt
fairwinds@eastlink.ca
In reply to: Tom Lane (#7)
Re: Array as parameter for plpgsql function

argh!!! It was telling me I had an error in select statement. Thanks
Tom!

Regards
David

On Wednesday, July 13, 2005, at 11:08 AM, Tom Lane wrote:

Show quoted text

David Pratt <fairwinds@eastlink.ca> writes:

CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '
DECLARE
test_array ALIAS FOR $1; -- alias for input array
BEGIN
return array_upper(test_array,1)
END;
' LANGUAGE 'plpgsql';

SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
two']]) AS output;

Works fine for me, once I add the semicolon you forgot:

return array_upper(test_array,1);

regards, tom lane

---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that
your
message can get through to the mailing list cleanly