Partitioning and constraint exclusion

Started by Jayadevan Mover 10 years ago6 messagesgeneral
Jump to latest
#1Jayadevan M
maymala.jayadevan@gmail.com

Hello ,

I have a parent table and 6 child tables (partitions). The child tables
have check constraints defined in the form
CHECK (myuid in (123,456,..)).
myuid is bigint, the constraints for the 6 child tables are definitely
mutually exclusive. The number of values in the list ranges from 2-10 for 5
of the child tables. For the 6th child table, the list is 2500+ elements.
When I try explain/explain analyze for even a simple query like

select * from parent where myuid in (123,456,789)

the child table with 2500+ elements gets always scanned. I have an index on
the column and that does get used. But why doesn't the planner just use
constraint exclusion and not go for the index scan? Anyone faced a similar
issue?

Thanks,
Jayadevan

#2Melvin Davidson
melvin6925@gmail.com
In reply to: Jayadevan M (#1)
Re: Partitioning and constraint exclusion

First, what is the PostgresSQL version ??????

Next, in postgresql.conf, what is the value of constraint_exclusion ?

On Mon, Sep 7, 2015 at 8:55 AM, Jayadevan M <maymala.jayadevan@gmail.com>
wrote:

Hello ,

I have a parent table and 6 child tables (partitions). The child tables
have check constraints defined in the form
CHECK (myuid in (123,456,..)).
myuid is bigint, the constraints for the 6 child tables are definitely
mutually exclusive. The number of values in the list ranges from 2-10 for 5
of the child tables. For the 6th child table, the list is 2500+ elements.
When I try explain/explain analyze for even a simple query like

select * from parent where myuid in (123,456,789)

the child table with 2500+ elements gets always scanned. I have an index
on the column and that does get used. But why doesn't the planner just use
constraint exclusion and not go for the index scan? Anyone faced a similar
issue?

Thanks,
Jayadevan

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Jayadevan M (#1)
Re: Partitioning and constraint exclusion

On Mon, Sep 7, 2015 at 8:55 AM, Jayadevan M <maymala.jayadevan@gmail.com>
wrote:

Hello ,

I have a parent table and 6 child tables (partitions). The child tables
have check constraints defined in the form
CHECK (myuid in (123,456,..)).
myuid is bigint, the constraints for the 6 child tables are definitely
mutually exclusive. The number of values in the list ranges from 2-10 for 5
of the child tables. For the 6th child table, the list is 2500+ elements.
When I try explain/explain analyze for even a simple query like

select * from parent where myuid in (123,456,789)

the child table with 2500+ elements gets always scanned. I have an index
on the column and that does get used. But why doesn't the planner just use
constraint exclusion and not go for the index scan? Anyone faced a similar
issue?

IIRC ​The planner doesn't understand​

​overlaps so having a definition of:

IN (1,2,3,4,5); or nearly equivalently = ANY(ARRAY[1,2,3,4,5]))​

and a request for:

IN (1,3,5) / = ANY(ARRAY[1,3,5]) is going to get you nowhere with the
planner.

​I am not sure but am doubting it is intelligent enough to recognize the
functional expression even if all of the values are present. "simple
equality" (
http://www.postgresql.org/docs/9.4/interactive/ddl-partitioning.html) this
is not.

David J.

#4Vick Khera
vivek@khera.org
In reply to: David G. Johnston (#3)
Re: Partitioning and constraint exclusion

On Mon, Sep 7, 2015 at 4:48 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

IIRC ​The planner doesn't understand​

​overlaps so having a definition of:

IN (1,2,3,4,5); or nearly equivalently = ANY(ARRAY[1,2,3,4,5]))​

and a request for:

IN (1,3,5) / = ANY(ARRAY[1,3,5]) is going to get you nowhere with the
planner.

The partition code does not execute your CHECK condition; it only tests to
see if the query includes it explicitly. For example, if you have a split
on "id % 100 = 59" for a table, then looking for id = 13059 does you no
good; you have to search for "id = 13059 AND id % 100 = 59" to invoke the
table exclusions.

#5Jayadevan M
maymala.jayadevan@gmail.com
In reply to: Melvin Davidson (#2)
Re: Partitioning and constraint exclusion

On Mon, Sep 7, 2015 at 7:12 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

First, what is the PostgresSQL version ??????

9.3.6

Next, in postgresql.conf, what is the value of constraint_exclusion ?

partition

In response to the other possible issues pointed out - the planner is
indeed *skipping the rest of the child tables* (the ones with just a few
values in the check constraint). That is why I feel the number of elements
in the check constraint on this particular child table is causing it to be
scanned. The query ends up scanning the table where the data will be found
and the table with 2500+ values in the check constraint. I may be missing
something?
I tried changing the filter from myuid in (1,2,3) to myuid = 1 or myuid = 2
or....
It did not improve the plan. One Index Cond became 3 Index Cond .

Thanks,
Jayadevan

#6Jayadevan M
maymala.jayadevan@gmail.com
In reply to: David G. Johnston (#3)
Re: Partitioning and constraint exclusion

​I am not sure but am doubting it is intelligent enough to recognize the
functional expression even if all of the values are present. "simple
equality" (
http://www.postgresql.org/docs/9.4/interactive/ddl-partitioning.html)
this is not.

Looks like the tables with about 100+ values in the check list gets pulled
in, even with constraint exclusion on. I created a simple test case. One
parent table with just one column, and 3 child tables with one column.
test=# \d+ parent
Table "public.parent"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
id | integer | | plain | |
Child tables: child1,
child2,
child3

test=# \d+ child1
Table "public.child1"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
id | integer | | plain | |
Check constraints:
"c" CHECK (id = ANY (ARRAY[1, 2]))
Inherits: parent

test=# \d+ child2
Table "public.child2"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
id | integer | | plain | |
Check constraints:
"c" CHECK (id = ANY (ARRAY[3, 4]))
Inherits: parent

test=# \d+ child3
Table "public.child3"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
id | integer | | plain | |
Check constraints:
"c3" CHECK (id = ANY (ARRAY[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
93, 94, 95, 96, 97, 98, 99, 100]))
Inherits: parent

test=# explain analyze select * from parent where id = 1;
QUERY PLAN

--------------------------------------------------------------------------------------------------------
Append (cost=0.00..40.00 rows=13 width=4) (actual time=0.002..0.002
rows=0 loops=1)
-> Seq Scan on parent (cost=0.00..0.00 rows=1 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: (id = 1)
* -> Seq Scan on child1 (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)*
* Filter: (id = 1)*
Total runtime: 0.029 ms

If I increase the number of values a bit more.....
with t as (select generate_series(*5,110*) x ) select 'alter table child3
add constraint c3 check ( id in ( ' || string_agg(x::text,',') || ' ))
; ' from t;

test=# explain analyze select * from parent where id = 1;
QUERY PLAN

--------------------------------------------------------------------------------------------------------
Append (cost=0.00..80.00 rows=25 width=4) (actual time=0.003..0.003
rows=0 loops=1)
-> Seq Scan on parent (cost=0.00..0.00 rows=1 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: (id = 1)
-> Seq Scan on child1 (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)
Filter: (id = 1)
* -> Seq Scan on child3 (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)*
* Filter: (id = 1)*

Thanks,
Jayadevan

David J.

Show quoted text