Have we tried to treat CTE as SubQuery in planner?
Hi:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);
explain
with cte1 as (select * from cte1)
select * from c where a = 1;
It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;
I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?
Thanks
--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);
explain
with cte1 as (select * from cte1)
select * from c where a = 1;
It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;
This example seems both confused and out of date. Since we changed
the rules on materializing CTEs (in 608b167f9), I get
regression=# create table c as select i as a, i from generate_series(1, 1000000)i;
SELECT 1000000
regression=# create index on c(a);
CREATE INDEX
regression=# explain
regression-# with cte1 as (select * from c)
regression-# select * from cte1 where a = 1;
QUERY PLAN
--------------------------------------------------------------------------
Bitmap Heap Scan on c (cost=95.17..4793.05 rows=5000 width=8)
Recheck Cond: (a = 1)
-> Bitmap Index Scan on c_a_idx (cost=0.00..93.92 rows=5000 width=0)
Index Cond: (a = 1)
(4 rows)
regards, tom lane
On Sat, Nov 14, 2020 at 2:04 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);explain
with cte1 as (select * from cte1)
select * from c where a = 1;It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?
Hi,
On Fri, Nov 13, 2020 at 10:04 PM Andy Fan wrote:
Hi:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);explain
with cte1 as (select * from cte1)
select * from c where a = 1;
ITYM:
EXPLAIN
WITH c AS (SELECT * FROM cte1)
SELECT * FROM c WHERE a = 1;
I'm also guessing your table DDL is:
CREATE TABLE cte1 (a int, b int);
It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;
Does it? On HEAD, I got the following plan:
(without stats):
Bitmap Heap Scan on foo
Recheck Cond: (a = 1)
-> Bitmap Index Scan on foo_a_idx
Index Cond: (a = 1)
(with stats):
Index Scan using foo_a_idx on foo
Index Cond: (a = 1)
I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?
This was brought up a few times, the most recent one I can recall was a
little bit over two years ago [1]/messages/by-id/87sh48ffhb.fsf@news-spur.riddles.org.uk
[1]: /messages/by-id/87sh48ffhb.fsf@news-spur.riddles.org.uk
On Sat, Nov 14, 2020 at 2:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);explain
with cte1 as (select * from cte1)
select * from c where a = 1;It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;This example seems both confused and out of date. Since we changed
the rules on materializing CTEs (in 608b167f9), I get
Sorry, I should have tested it again on the HEAD, and 608b167f9 is exactly
the thing I mean.
regression=# create table c as select i as a, i from generate_series(1,
1000000)i;
SELECT 1000000
regression=# create index on c(a);
CREATE INDEX
regression=# explain
regression-# with cte1 as (select * from c)
regression-# select * from cte1 where a = 1;
QUERY PLAN
--------------------------------------------------------------------------
Bitmap Heap Scan on c (cost=95.17..4793.05 rows=5000 width=8)
Recheck Cond: (a = 1)
-> Bitmap Index Scan on c_a_idx (cost=0.00..93.92 rows=5000 width=0)
Index Cond: (a = 1)
(4 rows)regards, tom lane
--
Best Regards
Andy Fan
On Sat, Nov 14, 2020 at 2:44 PM Jesse Zhang <sbjesse@gmail.com> wrote:
Hi,
On Fri, Nov 13, 2020 at 10:04 PM Andy Fan wrote:
Hi:
Take the following example:
insert into cte1 select i, i from generate_series(1, 1000000)i;
create index on cte1(a);explain
with cte1 as (select * from cte1)
select * from c where a = 1;ITYM:
EXPLAIN
WITH c AS (SELECT * FROM cte1)
SELECT * FROM c WHERE a = 1;I'm also guessing your table DDL is:
CREATE TABLE cte1 (a int, b int);
It needs to do seq scan on the above format, however it is pretty
quick if we change the query to
select * from (select * from cte1) c where a = 1;Does it? On HEAD, I got the following plan:
You understand me correctly, just too busy recently and make
me make mistakes like this. Sorry about that:(
(without stats):
Bitmap Heap Scan on foo
Recheck Cond: (a = 1)
-> Bitmap Index Scan on foo_a_idx
Index Cond: (a = 1)(with stats):
Index Scan using foo_a_idx on foo
Index Cond: (a = 1)
I know how we treat cte and subqueries differently currently,
I just don't know why we can't treat cte as a subquery, so lots of
subquery related technology can apply to it. Do we have any
discussion about this?This was brought up a few times, the most recent one I can recall was a
little bit over two years ago [1]
And I should have searched "CTE" at least for a while..
--
Best Regards
Andy Fan