WITH NOT MATERIALIZED and DML CTEs

Started by Elvis Pranskevichusover 6 years ago11 messages
#1Elvis Pranskevichus
elprans@gmail.com

Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently
disregard the "NOT MATERIALIZED" instruction and execute the data-
modifying CTE to completion (as per the long-standing DML CTE rule).

This seems like an omission to me. Ideally, the presence of an explicit
"NOT MATERIALIZED" clause on a data-modifying CTE should disable the
"run to completion" logic.

It is understandably late in the 12 cycle, so maybe prohibit NOT
MATERIALIZED with DML altogheter and revisit this in 13?

Thoughts?

Elvis

#2Andres Freund
andres@anarazel.de
In reply to: Elvis Pranskevichus (#1)
Re: WITH NOT MATERIALIZED and DML CTEs

Hi,

On 2019-06-03 11:45:51 -0400, Elvis Pranskevichus wrote:

Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently
disregard the "NOT MATERIALIZED" instruction and execute the data-
modifying CTE to completion (as per the long-standing DML CTE rule).

This seems like an omission to me. Ideally, the presence of an explicit
"NOT MATERIALIZED" clause on a data-modifying CTE should disable the
"run to completion" logic.

I don't see us ever doing that. The result of minor costing and other
planner changes would yield different updated data. That'll just create
endless bug reports.

It is understandably late in the 12 cycle, so maybe prohibit NOT
MATERIALIZED with DML altogheter and revisit this in 13?

I could see us adding an error, or just continuing to silently ignore
it.

Greetings,

Andres Freund

#3Elvis Pranskevichus
elprans@gmail.com
In reply to: Andres Freund (#2)
Re: WITH NOT MATERIALIZED and DML CTEs

On Monday, June 3, 2019 11:50:15 A.M. EDT Andres Freund wrote:

This seems like an omission to me. Ideally, the presence of an
explicit "NOT MATERIALIZED" clause on a data-modifying CTE should
disable the "run to completion" logic.

I don't see us ever doing that. The result of minor costing and other
planner changes would yield different updated data. That'll just
create endless bug reports.

I understand why the rule exists in the first place, but I think that an
explicit opt-in signals the assumption of responsibility and opens the
possibility of using this in a well-defined evaluation context, such as
CASE WHEN.

Elvis

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Elvis Pranskevichus (#3)
Re: WITH NOT MATERIALIZED and DML CTEs

Elvis Pranskevichus <elprans@gmail.com> writes:

On Monday, June 3, 2019 11:50:15 A.M. EDT Andres Freund wrote:

This seems like an omission to me. Ideally, the presence of an
explicit "NOT MATERIALIZED" clause on a data-modifying CTE should
disable the "run to completion" logic.

I don't see us ever doing that. The result of minor costing and other
planner changes would yield different updated data. That'll just
create endless bug reports.

I understand why the rule exists in the first place, but I think that an
explicit opt-in signals the assumption of responsibility and opens the
possibility of using this in a well-defined evaluation context, such as
CASE WHEN.

TBH, if you think it's well-defined, you're wrong. I concur with
Andres that turning off run-to-completion for DMLs would be disastrous.
For just one obvious point, what about firing AFTER triggers?

It's already the case that the planner will silently ignore NOT
MATERIALIZED for other cases where it can't inline the CTE for semantic
or implementation reasons -- see comments in SS_process_ctes(). I see
no good reason to treat the DML exception much differently from other
exceptions, such as presence of volatile functions or recursion.

regards, tom lane

#5Elvis Pranskevichus
elprans@gmail.com
In reply to: Tom Lane (#4)
Re: WITH NOT MATERIALIZED and DML CTEs

On Monday, June 3, 2019 12:09:46 P.M. EDT Tom Lane wrote:

I understand why the rule exists in the first place, but I think
that an explicit opt-in signals the assumption of responsibility
and opens the possibility of using this in a well-defined
evaluation context, such as CASE WHEN.

TBH, if you think it's well-defined, you're wrong.

The documentation seems to strongly suggest otherwise:

"When it is essential to force evaluation order, a CASE construct (see
Section 9.17) can be used. ... CASE construct used in this fashion will
defeat optimization attempts"

Are there cases where this is not true outside of the documented
exceptions (i.e. immutable early-eval and aggregates)?

Elvis

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Elvis Pranskevichus (#5)
Re: WITH NOT MATERIALIZED and DML CTEs

Elvis Pranskevichus <elprans@gmail.com> writes:

On Monday, June 3, 2019 12:09:46 P.M. EDT Tom Lane wrote:

I understand why the rule exists in the first place, but I think
that an explicit opt-in signals the assumption of responsibility
and opens the possibility of using this in a well-defined
evaluation context, such as CASE WHEN.

TBH, if you think it's well-defined, you're wrong.

The documentation seems to strongly suggest otherwise:

"When it is essential to force evaluation order, a CASE construct (see
Section 9.17) can be used. ... CASE construct used in this fashion will
defeat optimization attempts"

Are there cases where this is not true outside of the documented
exceptions (i.e. immutable early-eval and aggregates)?

CASE is a scalar-expression construct. It's got little to do with
the timing of scan/join operations such as row fetches. We offer
users essentially no control over when those happen ... other than
the guarantees about CTE materialization, which are exactly what
you say you want to give up.

regards, tom lane

#7Elvis Pranskevichus
elprans@gmail.com
In reply to: Tom Lane (#6)
Re: WITH NOT MATERIALIZED and DML CTEs

On Monday, June 3, 2019 1:03:44 P.M. EDT Tom Lane wrote:

CASE is a scalar-expression construct. It's got little to do with
the timing of scan/join operations such as row fetches. We offer
users essentially no control over when those happen ... other than
the guarantees about CTE materialization, which are exactly what
you say you want to give up.

In the general case, yes, but I *can* use a scalar-returning INSERT CTE
in a THEN clause as a subquery. Useful for a conditional INSERT, when
you can't use ON CONFLICT.

Anyway, I understand that the complications are probably not worth it.

Thanks,

Elvis

#8David Fetter
david@fetter.org
In reply to: Elvis Pranskevichus (#1)
Re: WITH NOT MATERIALIZED and DML CTEs

On Mon, Jun 03, 2019 at 11:45:51AM -0400, Elvis Pranskevichus wrote:

Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently
disregard the "NOT MATERIALIZED" instruction and execute the data-
modifying CTE to completion (as per the long-standing DML CTE rule).

This seems like an omission to me. Ideally, the presence of an explicit
"NOT MATERIALIZED" clause on a data-modifying CTE should disable the
"run to completion" logic.

It might be worth documenting the fact that NOT MATERIALIZED doesn't
affect DML CTEs, just as it doesn't affect statements with volatile
functions and recursive CTEs.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#8)
Re: WITH NOT MATERIALIZED and DML CTEs

David Fetter <david@fetter.org> writes:

It might be worth documenting the fact that NOT MATERIALIZED doesn't
affect DML CTEs, just as it doesn't affect statements with volatile
functions and recursive CTEs.

We already do:

However, if a WITH query is non-recursive and side-effect-free (that
is, it is a SELECT containing no volatile functions) then it can be
folded into the parent query, allowing joint optimization of the two
query levels. By default, this happens if the parent query references
the WITH query just once, but not if it references the WITH query more
than once. You can override that decision by specifying MATERIALIZED
to force separate calculation of the WITH query, or by specifying NOT
MATERIALIZED to force it to be merged into the parent query. The
latter choice risks duplicate computation of the WITH query, but it
can still give a net savings if each usage of the WITH query needs
only a small part of the WITH query's full output.

regards, tom lane

#10David Fetter
david@fetter.org
In reply to: Tom Lane (#9)
Re: WITH NOT MATERIALIZED and DML CTEs

On Mon, Jun 03, 2019 at 07:33:35PM -0400, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

It might be worth documenting the fact that NOT MATERIALIZED doesn't
affect DML CTEs, just as it doesn't affect statements with volatile
functions and recursive CTEs.

We already do:

However, if a WITH query is non-recursive and side-effect-free (that
is, it is a SELECT containing no volatile functions) then it can be

I guess this part makes it pretty clear that DML isn't part of the
party just yet.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#2)
Re: WITH NOT MATERIALIZED and DML CTEs

On 2019-Jun-03, Andres Freund wrote:

On 2019-06-03 11:45:51 -0400, Elvis Pranskevichus wrote:

It is understandably late in the 12 cycle, so maybe prohibit NOT
MATERIALIZED with DML altogheter and revisit this in 13?

I could see us adding an error, or just continuing to silently ignore
it.

Hmm, shouldn't we be throwing an error for that case? I'm not sure it's
defensible that we don't.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services