patch: optimize information_schema.constraint_column_usage

Started by Alexey Bashtanovalmost 9 years ago4 messages
#1Alexey Bashtanov
bashtanov@imap.cc
2 attachment(s)

Hello hackers,

The view information_schema.constraint_column_usage becomes slow when
the number of columns and constraints raise to substantial values.
This is because of a join condition that allows only join filter to
enforce. The patch is to optimize it.
See many_constraints.sql file attached for a performance test: create
3000 tables with 10 columns and a PK each and select * from the view.
The last statement works for 22 seconds on master branch, 34
milliseconds optimized on my laptop.

Best Regards,
Alexey Bashtanov

Attachments:

many-constraints.sqlapplication/sql; name=many-constraints.sqlDownload
constraint_column_usage.1.patchtext/x-patch; name=constraint_column_usage.1.patchDownload
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 00550eb..ffb1564 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -801,8 +801,8 @@ CREATE VIEW constraint_column_usage AS
           WHERE nr.oid = r.relnamespace
             AND r.oid = a.attrelid
             AND nc.oid = c.connamespace
-            AND (CASE WHEN c.contype = 'f' THEN r.oid = c.confrelid AND a.attnum = ANY (c.confkey)
-                      ELSE r.oid = c.conrelid AND a.attnum = ANY (c.conkey) END)
+            AND r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END
+            AND a.attnum = ANY (CASE c.contype WHEN 'f' THEN c.confkey ELSE c.conkey END)
             AND NOT a.attisdropped
             AND c.contype IN ('p', 'u', 'f')
             AND r.relkind = 'r'
#2Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Alexey Bashtanov (#1)
Re: patch: optimize information_schema.constraint_column_usage

The patch applies cleanly. No regression test fails.

On Thu, Feb 2, 2017 at 6:57 PM, Alexey Bashtanov <bashtanov@imap.cc> wrote:

Hello hackers,

The view information_schema.constraint_column_usage becomes slow when the
number of columns and constraints raise to substantial values.
This is because of a join condition that allows only join filter to enforce.
The patch is to optimize it.

Right. Because there are three relations involved in the condition,
it's evaluation is delayed till all three of them are joined, thus
increasing the sizes of minion joins. With this change, the size of
minion joins reduces. Since this is an all INNER join, splitting the
condition should be fine.

See many_constraints.sql file attached for a performance test: create 3000
tables with 10 columns and a PK each and select * from the view.
The last statement works for 22 seconds on master branch, 34 milliseconds
optimized on my laptop.

Yes, that's a lot of improvement. I can reproduce similar results on my laptop.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Ashutosh Bapat (#2)
Re: patch: optimize information_schema.constraint_column_usage

Please add this patch to the commitfest so that it's not forgotten.

On Fri, Feb 3, 2017 at 6:10 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

The patch applies cleanly. No regression test fails.

On Thu, Feb 2, 2017 at 6:57 PM, Alexey Bashtanov <bashtanov@imap.cc> wrote:

Hello hackers,

The view information_schema.constraint_column_usage becomes slow when the
number of columns and constraints raise to substantial values.
This is because of a join condition that allows only join filter to enforce.
The patch is to optimize it.

Right. Because there are three relations involved in the condition,
it's evaluation is delayed till all three of them are joined, thus
increasing the sizes of minion joins. With this change, the size of
minion joins reduces. Since this is an all INNER join, splitting the
condition should be fine.

See many_constraints.sql file attached for a performance test: create 3000
tables with 10 columns and a PK each and select * from the view.
The last statement works for 22 seconds on master branch, 34 milliseconds
optimized on my laptop.

Yes, that's a lot of improvement. I can reproduce similar results on my laptop.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Alexey Bashtanov (#1)
Re: patch: optimize information_schema.constraint_column_usage

On 2/2/17 08:27, Alexey Bashtanov wrote:

The view information_schema.constraint_column_usage becomes slow when
the number of columns and constraints raise to substantial values.
This is because of a join condition that allows only join filter to
enforce. The patch is to optimize it.

committed, thanks

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers