Re: CTE optimization fence on the todo list?
Has there been any movement on this in the last couple years?
I could really use the ability to optimize across CTE boundaries, and it
seems like a lot of other people could too.
On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
Has there been any movement on this in the last couple years?
I could really use the ability to optimize across CTE boundaries, and it
seems like a lot of other people could too.
I'm not aware that anyone is working on it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/30/15 6:35 AM, Robert Haas wrote:
On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
Has there been any movement on this in the last couple years?
I could really use the ability to optimize across CTE boundaries, and it
seems like a lot of other people could too.I'm not aware that anyone is working on it.
ISTR a comment to the effect of the SQL standard effectively requires
current behavior.
I'd still love to see a way around that though, even if it means some
kind of additional syntax; WITH is a lot nicer way to factor a query
than 10 nested subselects...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 1, 2015 at 4:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
ISTR a comment to the effect of the SQL standard effectively requires
current behavior.
I'd be astonished. The SQL standard doesn't even know that there is
such a thing as an index, so I presume it doesn't dictate the behavior
of the query planner either.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
On 4/30/15 6:35 AM, Robert Haas wrote:
On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
I could really use the ability to optimize across CTE boundaries, and it
seems like a lot of other people could too.
I'm not aware that anyone is working on it.
ISTR a comment to the effect of the SQL standard effectively requires
current behavior.
I doubt that the spec says anything about it one way or another.
However, there are a lot of cases where we definitely can't push
constraints into a WITH:
* Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
outer constraints into it would change the set of rows updated.
* Multiply-referenced WITH item (unless the outer query applies
identical constraints to each reference, which seems silly and not
worth the cycles to check for).
* Recursive WITH item (well, maybe in some cases you could push down a
clause and not change the results, but it seems very hard to analyze).
So initially we just punted and didn't consider flattening WITHs at
all. I'm not sure to what extent people are now expecting that behavior
and would be annoyed if we changed it.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 1, 2015 at 2:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, May 1, 2015 at 4:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
ISTR a comment to the effect of the SQL standard effectively requires
current behavior.I'd be astonished. The SQL standard doesn't even know that there is
such a thing as an index, so I presume it doesn't dictate the behavior
of the query planner either.
I agree. Somehow, the idea that this is within the standard caught on,
but I'm almost certain it's false.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/1/15 5:39 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
On 4/30/15 6:35 AM, Robert Haas wrote:
On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
I could really use the ability to optimize across CTE boundaries, and it
seems like a lot of other people could too.I'm not aware that anyone is working on it.
ISTR a comment to the effect of the SQL standard effectively requires
current behavior.I doubt that the spec says anything about it one way or another.
However, there are a lot of cases where we definitely can't push
constraints into a WITH:
* Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
outer constraints into it would change the set of rows updated.
* Multiply-referenced WITH item (unless the outer query applies
identical constraints to each reference, which seems silly and not
worth the cycles to check for).
* Recursive WITH item (well, maybe in some cases you could push down a
clause and not change the results, but it seems very hard to analyze).So initially we just punted and didn't consider flattening WITHs at
all. I'm not sure to what extent people are now expecting that behavior
and would be annoyed if we changed it.
I use CTEs for both organizational purposes and as optimization barriers
(in preference to using temp tables, when possible).
I'd definitely prefer to keep the barriers in place by default, perhaps
with a keyword to allow optimization across boundaries when appropriate.
However, when I really need optimization across boundaries I just use a
subquery.
It doesn't seem like there's much to be gained in terms of net
functionality.
--
- David Steele
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
On 5/1/15 5:39 PM, Tom Lane wrote:
I doubt that the spec says anything about it one way or another.
However, there are a lot of cases where we definitely can't push
constraints into a WITH:
* Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
outer constraints into it would change the set of rows updated.
* Multiply-referenced WITH item (unless the outer query applies
identical constraints to each reference, which seems silly and not
worth the cycles to check for).
* Recursive WITH item (well, maybe in some cases you could push down a
clause and not change the results, but it seems very hard to analyze).So initially we just punted and didn't consider flattening WITHs at
all. I'm not sure to what extent people are now expecting that behavior
and would be annoyed if we changed it.
I use CTEs for both organizational purposes and as optimization barriers
(in preference to using temp tables, when possible).
I'd definitely prefer to keep the barriers in place by default, perhaps
with a keyword to allow optimization across boundaries when appropriate.
However, when I really need optimization across boundaries I just use a
subquery.
FWIW, a bit of thought suggests that it would not take a lot of code to
handle this: you'd just have to check the conditions mentioned above
and then convert the RTE_CTE item into an RTE_SUBQUERY, much like
inline_set_returning_functions does with RTE_FUNCTION items.
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob. We should either do it or not. If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob. We should either do it or not. If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.
+1
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/1/15 6:32 PM, Peter Geoghegan wrote:
On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob. We should either do it or not. If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.+1
Not sure if I'm thrilled with the "OFFSET 0" hack but I guess it's not
much different from the CTE hack I've been using.
An "enable_cte_optimization" GUC would serve to keep old code from
breaking while giving new users/queries the advantage of optimization.
I'm not sure it's worth adding the complexity, though. In my experience
not that many developers use CTEs.
--
- David Steele
david@pgmasters.net
On 05/01/2015 03:30 PM, Tom Lane wrote:
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob. We should either do it or not. If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.
Yes.
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)
We'd need the GUC. I know of a lot of cases where people are using WITH
clauses specifically to override the query planner, and requiring them
to edit all of their queries in order to enable the old behavior would
become an upgrade barrier.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: WMcb1bf737ad9f62bfffe194a78293a06d4924dbd0a6cc2de5061ced0534b644b6e852b0ca1d8115f8c58f5a6a06ba72c5@asav-3.01.com
On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob. We should either do it or not. If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)
+1 to both. The default should be to allow the user to choose between CTE
and inline subqueries for style reasons alone - as much as possible since
you cannot have a correlated CTE nor a recursive subquery.
Trust in the planner, the planner is good. If it isn't then requiring
OFFSET 0 as the only means to create an optimization fence seems reasonable.
I like the GUC as an cheap means to keep the status-quo for those who
desire it.
While the idea of overriding the status-quo on a per-query basis has some
appeal the apparent cost-benefit ratio doesn't seem convincing.
David J.
On 05/01/2015 07:24 PM, Josh Berkus wrote:
O
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)We'd need the GUC. I know of a lot of cases where people are using WITH
clauses specifically to override the query planner, and requiring them
to edit all of their queries in order to enable the old behavior would
become an upgrade barrier.
+100
This could be a very bad, almost impossible to catch, behaviour break.
Even if we add the GUC, we're probably going to be imposing very
significant code audit costs on some users.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
On 05/01/2015 07:24 PM, Josh Berkus wrote:
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)
This could be a very bad, almost impossible to catch, behaviour break.
Even if we add the GUC, we're probably going to be imposing very
significant code audit costs on some users.
On what grounds do you claim it'd be a behavior break? It's possible
that the subquery flattening would result in less-desirable plans not
more-desirable ones, but the results should still be correct.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/03/2015 11:49 AM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
On 05/01/2015 07:24 PM, Josh Berkus wrote:
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)This could be a very bad, almost impossible to catch, behaviour break.
Even if we add the GUC, we're probably going to be imposing very
significant code audit costs on some users.On what grounds do you claim it'd be a behavior break? It's possible
that the subquery flattening would result in less-desirable plans not
more-desirable ones, but the results should still be correct.
I meant w.r.t. performance. Sorry if that wasn't clear.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/3/15 11:59 AM, Andrew Dunstan wrote:
On 05/03/2015 11:49 AM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
On 05/01/2015 07:24 PM, Josh Berkus wrote:
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a
reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)This could be a very bad, almost impossible to catch, behaviour break.
Even if we add the GUC, we're probably going to be imposing very
significant code audit costs on some users.On what grounds do you claim it'd be a behavior break? It's possible
that the subquery flattening would result in less-desirable plans not
more-desirable ones, but the results should still be correct.I meant w.r.t. performance. Sorry if that wasn't clear.
To put this in perspective... I've seen things like this take query
runtime from minutes to multiple hours or worse; bad enough that
"behavior break" becomes a valid description.
We definitely need to highlight this in the release notes, and I think
the GUC would be mandatory.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I need this feature a lot. Can anyone point me to a place in the code
where I can hack together a quick-and-dirty, compatibility-breaking
implementation? Thanks!
On Sun, May 3, 2015 at 10:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Show quoted text
On 5/3/15 11:59 AM, Andrew Dunstan wrote:
On 05/03/2015 11:49 AM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
On 05/01/2015 07:24 PM, Josh Berkus wrote:
(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a
reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)This could be a very bad, almost impossible to catch, behaviour break.
Even if we add the GUC, we're probably going to be imposing very
significant code audit costs on some users.On what grounds do you claim it'd be a behavior break? It's possible
that the subquery flattening would result in less-desirable plans not
more-desirable ones, but the results should still be correct.I meant w.r.t. performance. Sorry if that wasn't clear.
To put this in perspective... I've seen things like this take query
runtime from minutes to multiple hours or worse; bad enough that "behavior
break" becomes a valid description.We definitely need to highlight this in the release notes, and I think the
GUC would be mandatory.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 20, 2015 at 12:58 AM, Chris Rogers <teukros@gmail.com> wrote:
I need this feature a lot. Can anyone point me to a place in the code where
I can hack together a quick-and-dirty, compatibility-breaking
implementation? Thanks!
Does this help?
/messages/by-id/38448.1430519406@sss.pgh.pa.us
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 1, 2015 at 2:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
* Multiply-referenced WITH item (unless the outer query applies
identical constraints to each reference, which seems silly and not
worth the cycles to check for).
Not sure if I understand this correctly. Look at this query, CTE q is
referenced twice and it is obviously expand it helps:
postgres=# explain with q as (select * from a) select * from q q1 join
q q2 on q1.i=q2.i where q1.i <= 10 and q2.i >=2;
QUERY PLAN
----------------------------------------------------------------------
Nested Loop (cost=1443.59..1526.35 rows=9 width=16)
CTE q
-> Seq Scan on a a_2 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Scan using ai on a (cost=0.29..8.45 rows=9 width=8)
Index Cond: (i <= 10)
-> Index Scan using ai on a a_1 (cost=0.29..8.31 rows=1 width=8)
Index Cond: ((i = a.i) AND (i >= 2))
(7 rows)
Another question is that CTEs might be used as an optimization fence.
Think about a query like this:
WITH q1 as /* 5 table joins */, q2 as /* 5 tables join */, q3 ...,
q4 SELECT ...
If we expand all CTEs, we may end up optimize join with many tables
(could be bad). Or it is possible that users intentionally arrange
join in that way (more or less like hints) to override the optimizer.
We could look at geqo_threshold and decide how shall we expand, but
this may not be better than a GUC variable.
Regards,
Qingqing
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers