Writeable CTEs and empty relations

Started by Marko Tiikkajaabout 16 years ago27 messageshackers
Jump to latest
#1Marko Tiikkaja
marko@joh.to

Hi,

While playing around with another issue with the patch, I came across
the following:

=> create table foo(a int);
CREATE TABLE
=> with t as (insert into foo values(0)) select * from foo;
a
---
(0 rows)

I traced this down to heapam.c, which has this:

/*
* return null immediately if relation is empty
*/
if (scan->rs_nblocks == 0)
{
Assert(!BufferIsValid(scan->rs_cbuf));
tuple->t_data = NULL;
return;
}

and

/*
* Determine the number of blocks we have to scan.
*
* It is sufficient to do this once at scan start, since any tuples added
* while the scan is in progress will be invisible to my snapshot anyway.
* (That is not true when using a non-MVCC snapshot. However, we couldn't
* guarantee to return tuples added after scan start anyway, since they
* might go into pages we already scanned. To guarantee consistent
* results for a non-MVCC snapshot, the caller must hold some higher-level
* lock that ensures the interesting tuple(s) won't change.)
*/
scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);

This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan(). I'm not really familiar with this part of the
code, so I'm asking: is there a simple enough way around this? Would
updating scan->rs_nblocks before scanning the first tuple be OK?

Regards,
Marko Tiikkaja

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marko Tiikkaja (#1)
Re: Writeable CTEs and empty relations

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

I traced this down to heapam.c, which has this:
...
This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan().

So don't do that. The executor is generally entitled to assume that
parameters given to ExecutorStart are correct. In particular, changing
the snapshot after the query has started to run seems to me to ensure
all sorts of inconsistent and undesirable behavior.

regards, tom lane

#3Marko Tiikkaja
marko@joh.to
In reply to: Tom Lane (#2)
Re: Writeable CTEs and empty relations

On 2010-02-09 01:09 +0200, Tom Lane wrote:

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

I traced this down to heapam.c, which has this:
...
This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan().

So don't do that. The executor is generally entitled to assume that
parameters given to ExecutorStart are correct. In particular, changing
the snapshot after the query has started to run seems to me to ensure
all sorts of inconsistent and undesirable behavior.

You do remember that the whole patch in its current form depends on
modifying the snapshot?

Regards,
Marko Tiikkaja

#4Marko Tiikkaja
marko@joh.to
In reply to: Marko Tiikkaja (#1)
Re: Writeable CTEs and empty relations

On 2010-02-08 21:30 +0200, I wrote:

This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan(). I'm not really familiar with this part of the
code, so I'm asking: is there a simple enough way around this? Would
updating scan->rs_nblocks before scanning the first tuple be OK?

I've looked at this some more, and the problem is a lot bigger than I
originally thought. We'd basically be forced to do another initscan()
before starting a new scan after the snapshot changed. One way to
accomplish this would be that ExecutePlan() would leave a flag in EState
whenever the scan nodes need to reinit.

Does this sound completely unacceptable?

Regards,
Marko Tiikkaja

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marko Tiikkaja (#4)
Re: Writeable CTEs and empty relations

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

On 2010-02-08 21:30 +0200, I wrote:

This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan(). I'm not really familiar with this part of the
code, so I'm asking: is there a simple enough way around this? Would
updating scan->rs_nblocks before scanning the first tuple be OK?

I've looked at this some more, and the problem is a lot bigger than I
originally thought. We'd basically be forced to do another initscan()
before starting a new scan after the snapshot changed. One way to
accomplish this would be that ExecutePlan() would leave a flag in EState
whenever the scan nodes need to reinit.

Does this sound completely unacceptable?

You still haven't explained why it's a good idea to change the snapshot
after the executor has started. Right at the moment I'm prepared to
reject the patch on that ground alone.

regards, tom lane

#6Marko Tiikkaja
marko@joh.to
In reply to: Tom Lane (#5)
Re: Writeable CTEs and empty relations

On 2010-02-10 02:19 +0200, Tom Lane wrote:

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

Does this sound completely unacceptable?

You still haven't explained why it's a good idea to change the snapshot
after the executor has started. Right at the moment I'm prepared to
reject the patch on that ground alone.

The patch only touches the snapshot's curcid. That's needed to allow
the queries see the tuples of the DML WITHs ran before that particular
query.

Regards,
Marko Tiikkaja

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marko Tiikkaja (#6)
Re: Writeable CTEs and empty relations

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

On 2010-02-10 02:19 +0200, Tom Lane wrote:

You still haven't explained why it's a good idea to change the snapshot
after the executor has started. Right at the moment I'm prepared to
reject the patch on that ground alone.

The patch only touches the snapshot's curcid. That's needed to allow
the queries see the tuples of the DML WITHs ran before that particular
query.

... and they will also see, for example, tuples inserted by triggers
fired by those queries. I thought the plan for this feature was to
store and re-read the RETURNING data, not to go back and re-read the
underlying tables.

regards, tom lane

#8Marko Tiikkaja
marko@joh.to
In reply to: Tom Lane (#7)
Re: Writeable CTEs and empty relations

On 2010-02-10 03:20 +0200, Tom Lane wrote:

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

On 2010-02-10 02:19 +0200, Tom Lane wrote:

You still haven't explained why it's a good idea to change the snapshot
after the executor has started. Right at the moment I'm prepared to
reject the patch on that ground alone.

The patch only touches the snapshot's curcid. That's needed to allow
the queries see the tuples of the DML WITHs ran before that particular
query.

... and they will also see, for example, tuples inserted by triggers
fired by those queries. I thought the plan for this feature was to
store and re-read the RETURNING data, not to go back and re-read the
underlying tables.

I originally suggested this approach about four months ago. During this
whole time you haven't expressed any concerns about this, but suddenly
it's a reason to reject the patch?

Anyway, if we want to avoid modifying the snapshot, we can't bump the
CID between queries. I haven't thought this through, but it seems like
this could work. However, none of the WITH queries can see the previous
statement's tuples, which means that someone may be suprised when they
try to modify the same tuples twice just to discover that only the first
modification took place. I don't see this as a huge problem though.

This will also solve the problem I started this thread for and makes the
patch smaller, simpler and even a bit prettier. Unless someone sees a
problem with this approach, I'm going to make the change.

Regards,
Marko Tiikkaja

#9Robert Haas
robertmhaas@gmail.com
In reply to: Marko Tiikkaja (#8)
Re: Writeable CTEs and empty relations

On Wed, Feb 10, 2010 at 5:05 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-10 03:20 +0200, Tom Lane wrote:

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

On 2010-02-10 02:19 +0200, Tom Lane wrote:

You still haven't explained why it's a good idea to change the snapshot
after the executor has started.  Right at the moment I'm prepared to
reject the patch on that ground alone.

The patch only touches the snapshot's curcid.  That's needed to allow
the queries see the tuples of the DML WITHs ran before that particular
query.

... and they will also see, for example, tuples inserted by triggers
fired by those queries.  I thought the plan for this feature was to
store and re-read the RETURNING data, not to go back and re-read the
underlying tables.

I originally suggested this approach about four months ago.  During this
whole time you haven't expressed any concerns about this, but suddenly
it's a reason to reject the patch?

Anyway, if we want to avoid modifying the snapshot, we can't bump the
CID between queries.  I haven't thought this through, but it seems like
this could work.  However, none of the WITH queries can see the previous
statement's tuples, which means that someone may be suprised when they
try to modify the same tuples twice just to discover that only the first
modification took place.  I don't see this as a huge problem though.

I think that we agreed previously that running all the DML queries to
completion before the main query, bumping the CID after each one, and
saving the outputs, was the only workable approach.

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00614.php

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01964.php

...Robert

#10Marko Tiikkaja
marko@joh.to
In reply to: Robert Haas (#9)
Re: Writeable CTEs and empty relations

On 2010-02-10 21:59 +0200, Robert Haas wrote:

On Wed, Feb 10, 2010 at 5:05 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-10 03:20 +0200, Tom Lane wrote:

Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:

On 2010-02-10 02:19 +0200, Tom Lane wrote:

You still haven't explained why it's a good idea to change the snapshot
after the executor has started. Right at the moment I'm prepared to
reject the patch on that ground alone.

The patch only touches the snapshot's curcid. That's needed to allow
the queries see the tuples of the DML WITHs ran before that particular
query.

... and they will also see, for example, tuples inserted by triggers
fired by those queries. I thought the plan for this feature was to
store and re-read the RETURNING data, not to go back and re-read the
underlying tables.

I originally suggested this approach about four months ago. During this
whole time you haven't expressed any concerns about this, but suddenly
it's a reason to reject the patch?

Anyway, if we want to avoid modifying the snapshot, we can't bump the
CID between queries. I haven't thought this through, but it seems like
this could work. However, none of the WITH queries can see the previous
statement's tuples, which means that someone may be suprised when they
try to modify the same tuples twice just to discover that only the first
modification took place. I don't see this as a huge problem though.

I think that we agreed previously that running all the DML queries to
completion before the main query, bumping the CID after each one, and
saving the outputs, was the only workable approach.

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php

Hmm. Yeah. Didn't think about that :-(

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

Then I guess that's pretty much the only option.

Regards,
Marko Tiikkaja

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: Writeable CTEs and empty relations

Robert Haas <robertmhaas@gmail.com> writes:

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

Yes, I think so. That's the way it's always worked in the past;
see for example PortalRunMulti() and ProcessQuery(). I think trying
to change that is a high-risk, low-reward activity.

This probably means that the planner output for queries involving
writeable CTEs has to be a separate PlannedStmt per such CTE.

regards, tom lane

#12Marko Tiikkaja
marko@joh.to
In reply to: Tom Lane (#11)
Re: Writeable CTEs and empty relations

On 2010-02-10 23:57 +0200, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

Yes, I think so. That's the way it's always worked in the past;
see for example PortalRunMulti() and ProcessQuery(). I think trying
to change that is a high-risk, low-reward activity.

This probably means that the planner output for queries involving
writeable CTEs has to be a separate PlannedStmt per such CTE.

I'm looking at this, but I can't think of any good way of associating
the tuplestores from PortalRunMulti() with the correct CTEs. Any ideas?

Regards,
Marko Tiikkaja

#13Marko Tiikkaja
marko@joh.to
In reply to: Marko Tiikkaja (#12)
Re: Writeable CTEs and empty relations

On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote:

On 2010-02-10 23:57 +0200, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

Yes, I think so. That's the way it's always worked in the past;
see for example PortalRunMulti() and ProcessQuery(). I think trying
to change that is a high-risk, low-reward activity.

This probably means that the planner output for queries involving
writeable CTEs has to be a separate PlannedStmt per such CTE.

I'm looking at this, but I can't think of any good way of associating
the tuplestores from PortalRunMulti() with the correct CTEs. Any ideas?

Ok, what about the following:
- after planning the original query, standard_planner() goes through
the list of top-level CTEs and assigns a running number for each of
the DML WITHs, in the order they will be executed and returns a
list of PlannedStmts. all necessary statements wi have a flag
signaling that the result should be stored in a tuplestore.
- the portal keeps the list of tuplestores around and passes that
list to every query through PlannedStmt. it keeps on executing
the statements until it finds a PlannedStmt that doesn't want its
results stored anywhere and then frees the list of tuplestores

Does this sound reasonable? And more importantly, am I going to be
wasting my time implementing this? The 15th isn't that far away..

Regards,
Marko Tiikkaja

#14Robert Haas
robertmhaas@gmail.com
In reply to: Marko Tiikkaja (#13)
Re: Writeable CTEs and empty relations

On Wed, Feb 10, 2010 at 6:25 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote:

On 2010-02-10 23:57 +0200, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

If the executor has buried in it the assumption that the snapshot
can't change after startup, then does that mean that we need to start
up and shut down the executor for each subquery?

Yes, I think so.  That's the way it's always worked in the past;
see for example PortalRunMulti() and ProcessQuery().  I think trying
to change that is a high-risk, low-reward activity.

This probably means that the planner output for queries involving
writeable CTEs has to be a separate PlannedStmt per such CTE.

I'm looking at this, but I can't think of any good way of associating
the tuplestores from PortalRunMulti() with the correct CTEs.  Any ideas?

Ok, what about the following:
 - after planning the original query, standard_planner() goes through
   the list of top-level CTEs and assigns a running number for each of
   the DML WITHs, in the order they will be executed and returns a
   list of PlannedStmts.  all necessary statements wi have a flag
   signaling that the result should be stored in a tuplestore.
 - the portal keeps the list of tuplestores around and passes that
   list to every query through PlannedStmt.  it keeps on executing
   the statements until it finds a PlannedStmt that doesn't want its
   results stored anywhere and then frees the list of tuplestores

Wouldn't you'd want to stop when you ran out of PlannedStmts, not when
you hit one that didn't need its results stored? What happens if
there's an unreferenced CTE in the middle of the list?

Does this sound reasonable?  And more importantly, am I going to be
wasting my time implementing this?  The 15th isn't that far away..

I have to admit I've been starting to have a feeling over the last
couple of days that this patch isn't going to make it for 9.0... but
obviously I'm in no position to guarantee anything one way or the
other. Please do keep us up to date on your plans, though.

Thanks,

...Robert

#15Marko Tiikkaja
marko@joh.to
In reply to: Robert Haas (#14)
Re: Writeable CTEs and empty relations

On 2010-02-11 01:58 +0200, Robert Haas wrote:

On Wed, Feb 10, 2010 at 6:25 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote:
Ok, what about the following:
- after planning the original query, standard_planner() goes through
the list of top-level CTEs and assigns a running number for each of
the DML WITHs, in the order they will be executed and returns a
list of PlannedStmts. all necessary statements wi have a flag
signaling that the result should be stored in a tuplestore.
- the portal keeps the list of tuplestores around and passes that
list to every query through PlannedStmt. it keeps on executing
the statements until it finds a PlannedStmt that doesn't want its
results stored anywhere and then frees the list of tuplestores

Wouldn't you'd want to stop when you ran out of PlannedStmts, not when
you hit one that didn't need its results stored? What happens if
there's an unreferenced CTE in the middle of the list?

Right, of course.

Does this sound reasonable? And more importantly, am I going to be
wasting my time implementing this? The 15th isn't that far away..

I have to admit I've been starting to have a feeling over the last
couple of days that this patch isn't going to make it for 9.0... but
obviously I'm in no position to guarantee anything one way or the
other. Please do keep us up to date on your plans, though.

Unfortunately, so have I. I'll take a shot at implementing this, but if
I come across any problems, I have to give up.

Regards,
Marko Tiikkaja

#16Marko Tiikkaja
marko@joh.to
In reply to: Marko Tiikkaja (#15)
Re: Writeable CTEs and empty relations

On 2010-02-11 02:13 +0200, I wrote:

On 2010-02-11 01:58 +0200, Robert Haas wrote:

I have to admit I've been starting to have a feeling over the last
couple of days that this patch isn't going to make it for 9.0... but
obviously I'm in no position to guarantee anything one way or the
other. Please do keep us up to date on your plans, though.

Unfortunately, so have I. I'll take a shot at implementing this, but if
I come across any problems, I have to give up.

I'm going to have to disappoint a bunch of people and give up. :-(

At this point, I've already used a huge amount of my time and energy to
work on this patch during this commit fest, and I simply can't continue
like this. There's only 4 days left anyway, and I don't feel confident
enough to spend a lot of time during the next couple of days just to get
one more shot. I don't even have any kind of certainty that there
would've been a shot, even if I finished the patch on time.

Thanks everyone for helping out and reviewing this, especially Robert
and Tom.

Regards,
Marko Tiikkaja

#17Marko Tiikkaja
marko@joh.to
In reply to: Marko Tiikkaja (#16)
Re: Writeable CTEs and empty relations

On 2010-02-11 03:44 +0200, I wrote:

I'm going to have to disappoint a bunch of people and give up. :-(

Btw. would it make sense to apply the WITH-on-top-of-DML part of this
patch? At least to me, this seems useful because you can write a
RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that CTE.

Regards,
Marko Tiikkaja

#18Robert Haas
robertmhaas@gmail.com
In reply to: Marko Tiikkaja (#17)
Re: Writeable CTEs and empty relations

On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-11 03:44 +0200, I wrote:

I'm going to have to disappoint a bunch of people and give up. :-(

Btw. would it make sense to apply the WITH-on-top-of-DML part of this
patch?  At least to me, this seems useful because you can write a
RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that CTE.

Hmm, that's a thought. Can you split out just that part?

...Robert

#19Marko Tiikkaja
marko@joh.to
In reply to: Robert Haas (#18)
Re: Writeable CTEs and empty relations

On Thu, 11 Feb 2010 10:53:22 -0500, Robert Haas <robertmhaas@gmail.com>
wrote:

On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-11 03:44 +0200, I wrote:

I'm going to have to disappoint a bunch of people and give up. :-(

Btw. would it make sense to apply the WITH-on-top-of-DML part of this
patch?  At least to me, this seems useful because you can write a
RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that
CTE.

Hmm, that's a thought. Can you split out just that part?

Here's the patch. It's the same as the stuff in writeable CTE patches, but
I added regression tests.

Regards,
Marko Tiikkaja

Attachments:

with_on_dml.patchtext/plain; charset=UTF-8; name=with_on_dml.patchDownload+117-39
#20Marko Tiikkaja
marko@joh.to
In reply to: Marko Tiikkaja (#19)
Re: Writeable CTEs and empty relations

On Thu, 11 Feb 2010 19:28:28 +0200, I wrote:

On Thu, 11 Feb 2010 10:53:22 -0500, Robert Haas <robertmhaas@gmail.com>
wrote:

On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

On 2010-02-11 03:44 +0200, I wrote:

I'm going to have to disappoint a bunch of people and give up. :-(

Btw. would it make sense to apply the WITH-on-top-of-DML part of this
patch?  At least to me, this seems useful because you can write a
RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that
CTE.

Hmm, that's a thought. Can you split out just that part?

Here's the patch. It's the same as the stuff in writeable CTE patches,

but

I added regression tests.

Whoops. The reference section in docs still had some traces of writeable
CTEs. Updated patch attached.

Regards,
Marko Tiikkaja

Attachments:

with_on_dml2.patchtext/plain; charset=UTF-8; name=with_on_dml2.patchDownload+195-37
#21Robert Haas
robertmhaas@gmail.com
In reply to: Marko Tiikkaja (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#21)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marko Tiikkaja (#20)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#25)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#26)