Have we tried to treat CTE as SubQuery in planner?

Started by Andy Fanabout 5 years ago6 messages
#1Andy Fan
zhihui.fan1213@gmail.com

Hi:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?

Thanks

--
Best Regards
Andy Fan

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andy Fan (#1)
Re: Have we tried to treat CTE as SubQuery in planner?

Andy Fan <zhihui.fan1213@gmail.com> writes:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

This example seems both confused and out of date. Since we changed
the rules on materializing CTEs (in 608b167f9), I get

regression=# create table c as select i as a, i from generate_series(1, 1000000)i;
SELECT 1000000
regression=# create index on c(a);
CREATE INDEX
regression=# explain
regression-# with cte1 as (select * from c)
regression-# select * from cte1 where a = 1;
QUERY PLAN
--------------------------------------------------------------------------
Bitmap Heap Scan on c (cost=95.17..4793.05 rows=5000 width=8)
Recheck Cond: (a = 1)
-> Bitmap Index Scan on c_a_idx (cost=0.00..93.92 rows=5000 width=0)
Index Cond: (a = 1)
(4 rows)

regards, tom lane

#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Andy Fan (#1)
Re: Have we tried to treat CTE as SubQuery in planner?

On Sat, Nov 14, 2020 at 2:04 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

Hi:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?

See https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b

#4Jesse Zhang
sbjesse@gmail.com
In reply to: Andy Fan (#1)
Re: Have we tried to treat CTE as SubQuery in planner?

Hi,

On Fri, Nov 13, 2020 at 10:04 PM Andy Fan wrote:

Hi:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

ITYM:

EXPLAIN
WITH c AS (SELECT * FROM cte1)
SELECT * FROM c WHERE a = 1;

I'm also guessing your table DDL is:

CREATE TABLE cte1 (a int, b int);

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

Does it? On HEAD, I got the following plan:

(without stats):
Bitmap Heap Scan on foo
Recheck Cond: (a = 1)
-> Bitmap Index Scan on foo_a_idx
Index Cond: (a = 1)

(with stats):
Index Scan using foo_a_idx on foo
Index Cond: (a = 1)

I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?

This was brought up a few times, the most recent one I can recall was a
little bit over two years ago [1]/messages/by-id/87sh48ffhb.fsf@news-spur.riddles.org.uk

[1]: /messages/by-id/87sh48ffhb.fsf@news-spur.riddles.org.uk

#5Andy Fan
zhihui.fan1213@gmail.com
In reply to: Tom Lane (#2)
Re: Have we tried to treat CTE as SubQuery in planner?

On Sat, Nov 14, 2020 at 2:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andy Fan <zhihui.fan1213@gmail.com> writes:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

This example seems both confused and out of date. Since we changed
the rules on materializing CTEs (in 608b167f9), I get

Sorry, I should have tested it again on the HEAD, and 608b167f9 is exactly
the thing I mean.

regression=# create table c as select i as a, i from generate_series(1,

1000000)i;
SELECT 1000000
regression=# create index on c(a);
CREATE INDEX
regression=# explain
regression-# with cte1 as (select * from c)
regression-# select * from cte1 where a = 1;
QUERY PLAN
--------------------------------------------------------------------------
Bitmap Heap Scan on c (cost=95.17..4793.05 rows=5000 width=8)
Recheck Cond: (a = 1)
-> Bitmap Index Scan on c_a_idx (cost=0.00..93.92 rows=5000 width=0)
Index Cond: (a = 1)
(4 rows)

regards, tom lane

--
Best Regards
Andy Fan

#6Andy Fan
zhihui.fan1213@gmail.com
In reply to: Jesse Zhang (#4)
Re: Have we tried to treat CTE as SubQuery in planner?

On Sat, Nov 14, 2020 at 2:44 PM Jesse Zhang <sbjesse@gmail.com> wrote:

Hi,

On Fri, Nov 13, 2020 at 10:04 PM Andy Fan wrote:

Hi:

Take the following example:

insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);

explain
with cte1 as (select * from cte1)
select * from c where a = 1;

ITYM:

EXPLAIN
WITH c AS (SELECT * FROM cte1)
SELECT * FROM c WHERE a = 1;

I'm also guessing your table DDL is:

CREATE TABLE cte1 (a int, b int);

It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;

Does it? On HEAD, I got the following plan:

You understand me correctly, just too busy recently and make
me make mistakes like this. Sorry about that:(

(without stats):
Bitmap Heap Scan on foo
Recheck Cond: (a = 1)
-> Bitmap Index Scan on foo_a_idx
Index Cond: (a = 1)

(with stats):
Index Scan using foo_a_idx on foo
Index Cond: (a = 1)

I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?

This was brought up a few times, the most recent one I can recall was a
little bit over two years ago [1]

[1] /messages/by-id/87sh48ffhb.fsf@news-spur.riddles.org.uk

And I should have searched "CTE" at least for a while..

--
Best Regards
Andy Fan