Problem with update on partitioned table

Started by Alex Soloveyabout 18 years ago8 messagesgeneral
Jump to latest
#1Alex Solovey
a.solovey@gmail.com

Hello,

We have pretty big production database (running PostgreSQL 8.3.1) with
many partitioned tables. In most cases, they work well (since 8.2.1 at
least) -- constraint exclusion is able to select correct partitions.
However, there is an exception: queries on partitioned tables using
PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.

Here is a simple test case:
------------------
CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY );
INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 );

CREATE TABLE foo (
part INT NOT NULL
,foo_data INT
,bar_id INT NOT NULL REFERENCES bar( bar_id )
);

CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo );
INSERT INTO foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 );

CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo );
INSERT INTO foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 );

CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo );
INSERT INTO foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 );
------------------

As you can see, table "Foo" is partitioned by column "part". If only
"Foo" is referenced in update, query plan is fine:

=> EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2;
QUERY PLAN
------------------------------------------------------------------
Append (cost=0.00..68.50 rows=20 width=14)
-> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
(5 rows)

However, for this query it is far from being optimal:

=> EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND
foo.bar_id = bar.bar_id;
QUERY PLAN

--------------------------------------------------------------------------------
Append (cost=0.00..nan rows=22 width=14)
-> Nested Loop (cost=0.00..73.05 rows=10 width=14)
-> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1
width=4)
Index Cond: (bar.bar_id = public.foo.bar_id)
-> Merge Join (cost=nan..nan rows=1 width=8)
Merge Cond: (public.foo.bar_id = bar.bar_id)
-> Sort (cost=0.02..0.03 rows=1 width=0)
Sort Key: public.foo.bar_id
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: false
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: bar.bar_id
-> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4)
-> Nested Loop (cost=0.00..73.05 rows=10 width=14)
-> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1
width=4)
Index Cond: (bar.bar_id = public.foo.bar_id)
-> Merge Join (cost=nan..nan rows=1 width=8)
Merge Cond: (public.foo.bar_id = bar.bar_id)
-> Sort (cost=0.02..0.03 rows=1 width=0)
Sort Key: public.foo.bar_id
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: false
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: bar.bar_id
-> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4)
(29 rows)

Is there any way to avoid this anomaly? UPDATE ... FROM is very
convenient if you have to update rows depending on conditions involving
multiple tables. In addition, with partitioned tables,
standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo,
bar...) is even worse because query planner cannot choose correct
partitions without nested select results and so it does a complete scan
of all partitions instead.

Alex

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Solovey (#1)
Re: Problem with update on partitioned table

Alex Solovey <a.solovey@gmail.com> writes:

We have pretty big production database (running PostgreSQL 8.3.1) with
many partitioned tables. In most cases, they work well (since 8.2.1 at
least) -- constraint exclusion is able to select correct partitions.
However, there is an exception: queries on partitioned tables using
PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.

Hmm, the immediate problem is that cost_mergejoin is coming out with
a silly cost (NaN) because of division by zero. The attached patch
should get it back to 8.2-equivalent behavior. But really we're missing
a bet because the sub-joins ought to get discarded entirely when we know
they must be empty. There are various places testing for this but it
looks like make_join_rel() needs to do it too.

regards, tom lane

Index: src/backend/optimizer/path/costsize.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v
retrieving revision 1.191
diff -c -r1.191 costsize.c
*** src/backend/optimizer/path/costsize.c	1 Jan 2008 19:45:50 -0000	1.191
--- src/backend/optimizer/path/costsize.c	24 Mar 2008 20:55:42 -0000
***************
*** 1385,1390 ****
--- 1385,1396 ----
  	Selectivity joininfactor;
  	Path		sort_path;		/* dummy for result of cost_sort */
+ 	/* Protect some assumptions below that rowcounts aren't zero */
+ 	if (outer_path_rows <= 0)
+ 		outer_path_rows = 1;
+ 	if (inner_path_rows <= 0)
+ 		inner_path_rows = 1;
+ 
  	if (!enable_mergejoin)
  		startup_cost += disable_cost;
#3Martin Gainty
mgainty@hotmail.com
In reply to: Alex Solovey (#1)
Re: Problem with update on partitioned table

Alex-

http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS
TRAINT-EXCLUSION
postgresql.conf contains a constraint_exclusion parameter called
constraint_exclusion (boolean)
which if you dont want to scan ALL partitions must be set to 'on'
constraint_exclusion = on

HTH
Martin
----- Original Message -----
From: "Alex Solovey" <a.solovey@gmail.com>
To: <pgsql-general@postgresql.org>
Sent: Monday, March 24, 2008 2:15 PM
Subject: [GENERAL] Problem with update on partitioned table

Hello,

We have pretty big production database (running PostgreSQL 8.3.1) with
many partitioned tables. In most cases, they work well (since 8.2.1 at
least) -- constraint exclusion is able to select correct partitions.
However, there is an exception: queries on partitioned tables using
PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.

Here is a simple test case:
------------------
CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY );
INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 );

CREATE TABLE foo (
part INT NOT NULL
,foo_data INT
,bar_id INT NOT NULL REFERENCES bar( bar_id )
);

CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo );
INSERT INTO foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 );

CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo );
INSERT INTO foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 );

CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo );
INSERT INTO foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 );
------------------

As you can see, table "Foo" is partitioned by column "part". If only
"Foo" is referenced in update, query plan is fine:

=> EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2;
QUERY PLAN
------------------------------------------------------------------
Append (cost=0.00..68.50 rows=20 width=14)
-> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
(5 rows)

However, for this query it is far from being optimal:

=> EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND
foo.bar_id = bar.bar_id;
QUERY PLAN

--------------------------------------------------------------------------

------

Show quoted text

Append (cost=0.00..nan rows=22 width=14)
-> Nested Loop (cost=0.00..73.05 rows=10 width=14)
-> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1
width=4)
Index Cond: (bar.bar_id = public.foo.bar_id)
-> Merge Join (cost=nan..nan rows=1 width=8)
Merge Cond: (public.foo.bar_id = bar.bar_id)
-> Sort (cost=0.02..0.03 rows=1 width=0)
Sort Key: public.foo.bar_id
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: false
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: bar.bar_id
-> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4)
-> Nested Loop (cost=0.00..73.05 rows=10 width=14)
-> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14)
Filter: (part = 2)
-> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1
width=4)
Index Cond: (bar.bar_id = public.foo.bar_id)
-> Merge Join (cost=nan..nan rows=1 width=8)
Merge Cond: (public.foo.bar_id = bar.bar_id)
-> Sort (cost=0.02..0.03 rows=1 width=0)
Sort Key: public.foo.bar_id
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: false
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: bar.bar_id
-> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4)
(29 rows)

Is there any way to avoid this anomaly? UPDATE ... FROM is very
convenient if you have to update rows depending on conditions involving
multiple tables. In addition, with partitioned tables,
standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo,
bar...) is even worse because query planner cannot choose correct
partitions without nested select results and so it does a complete scan
of all partitions instead.

Alex

-
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Alex Solovey
a.solovey@gmail.com
In reply to: Martin Gainty (#3)
Re: Problem with update on partitioned table

Martin,

which if you dont want to scan ALL partitions must be set to 'on'
constraint_exclusion = on

It is 'ON'. The problem is that it does not work well for 'UPDATE foo
... FROM bar' queries, when partitioned table 'foo' is joined with
another table.

Martin Gainty wrote:

Show quoted text

Alex-

http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS
TRAINT-EXCLUSION
postgresql.conf contains a constraint_exclusion parameter called
constraint_exclusion (boolean)
which if you dont want to scan ALL partitions must be set to 'on'
constraint_exclusion = on

HTH
Martin

#5Alex Solovey
a.solovey@gmail.com
In reply to: Tom Lane (#2)
Re: Problem with update on partitioned table

Tom,

Thanks for the patch. We've tried it here, and it improved query plan
slightly (indeed, it looks exactly like the plan from 8.2.6 now).
But, as you've said, empty sub-joins are still not discarded, so query
execution time did not improve. And this is the same in both 8.2 and 8.3.

Note that only 'UPDATE FROM' does not discard sub-joins. 'SELECT' on the
same tables is just fine:

=> EXPLAIN SELECT * FROM foo, bar WHERE part = 2 AND foo.bar_id =
bar.bar_id;
QUERY PLAN

-----------------------------------------------------------------------------
Hash Join (cost=64.47..104.08 rows=18 width=16)
Hash Cond: (bar.bar_id = public.foo.bar_id)
-> Seq Scan on bar (cost=0.00..31.40 rows=2140 width=4)
-> Hash (cost=64.25..64.25 rows=18 width=12)
-> Append (cost=0.00..64.25 rows=18 width=12)
-> Seq Scan on foo (cost=0.00..32.12 rows=9 width=12)
Filter: (part = 2)
-> Seq Scan on foo_2 foo (cost=0.00..32.12 rows=9
width=12)
Filter: (part = 2)
(9 rows)

-- Alex

Tom Lane wrote:

Show quoted text

Hmm, the immediate problem is that cost_mergejoin is coming out with
a silly cost (NaN) because of division by zero. The attached patch
should get it back to 8.2-equivalent behavior. But really we're missing
a bet because the sub-joins ought to get discarded entirely when we know
they must be empty. There are various places testing for this but it
looks like make_join_rel() needs to do it too.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Solovey (#5)
Re: Problem with update on partitioned table

Alex Solovey <a.solovey@gmail.com> writes:

Thanks for the patch. We've tried it here, and it improved query plan
slightly (indeed, it looks exactly like the plan from 8.2.6 now).
But, as you've said, empty sub-joins are still not discarded, so query
execution time did not improve. And this is the same in both 8.2 and 8.3.

If you're feeling brave, try the patch I just committed to CVS.

regards, tom lane

#7Alex Solovey
a.solovey@gmail.com
In reply to: Tom Lane (#6)
Re: Problem with update on partitioned table

Tom,

If you're feeling brave, try the patch I just committed to CVS.

I just did it. It works! According to the query plan, only one partition
is being examined now.

Is this patch going to be included in 8.3 only, or in 8.2 as well?

Thanks!

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Solovey (#7)
Re: Problem with update on partitioned table

Alex Solovey <a.solovey@gmail.com> writes:

Is this patch going to be included in 8.3 only, or in 8.2 as well?

I thought I was already taking a chance by putting it in REL8_3_STABLE.
Since 8.2 doesn't have a regression compared to the previous release,
and is also very much more different from HEAD, I'm not inclined to
touch it.

regards, tom lane