Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Hello all.
Please review my ideas for an implementation.
The SQL standard has feature T431: Extended grouping capabilities.
I found this feature in the TODO list:
http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO
MS SQL 2005 partially supports this feature:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx
MS SQL 2008 supports this feature:
http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx
Oracle supports this feature:
http://www.compshack.com/sql/oracle-group-rollup
These are only short notes about GROUPING SETS; more complete information is in the official documentation of MS SQL and Oracle (copyright prevents sending it as an attachment).
First: GROUPING SETS.
An example of using grouping sets:
select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A), () )
The semantics of this construction: perform GROUP BY over the source by more than one set of columns.
(A,B,C) is the widest key. In the result set of this example we can find the result of "select A,B,C,SUM(D) from table group by A,B,C" as a subset; that is, the output of "GROUP BY A,B,C" is a subset of the output of "GROUP BY GROUPING SETS( (A,B,C), (A), () )".
The second subset is "GROUP BY A"; in place of the B and C columns we see NULL.
The third subset is "GROUP BY ()": in place of A,B,C we see NULL, and there is a single row, called the "GRAND TOTAL", calculated over the whole input without grouping.
There is also a function, GROUPING, which tells whether a NULL is a "real" NULL (from the table) or one generated by GROUP BY GROUPING SETS.
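To make the semantics concrete, here is a small simulation in Python. The sample rows and the helper name are mine, purely for illustration; each grouping set is just an ordinary GROUP BY with the missing key columns emitted as NULL:

```python
rows = [  # (A, B, C, D)
    ("x", 1, "p", 10),
    ("x", 1, "p", 5),
    ("x", 2, "q", 7),
    ("y", 1, "p", 3),
]

def group_sum(rows, keycols):
    """GROUP BY the given column indexes (0=A, 1=B, 2=C) and SUM column 3.
    Key columns not in keycols are emitted as None (SQL NULL)."""
    acc = {}
    for r in rows:
        key = tuple(r[i] if i in keycols else None for i in (0, 1, 2))
        acc[key] = acc.get(key, 0) + r[3]
    return {k + (v,) for k, v in acc.items()}

# GROUPING SETS ((A,B,C), (A), ()) = union of three ordinary GROUP BYs
result = group_sum(rows, {0, 1, 2}) | group_sum(rows, {0}) | group_sum(rows, set())

# The (A) subset: B and C replaced by NULL
assert ("x", None, None, 22) in result
# The grand total: one row with every key column NULL
assert (None, None, None, 25) in result
```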
My point: this feature can be implemented on top of GROUP BY and UNION ALL.
We can decompose "GROUP BY GROUPING SETS( (A,B,C),(A),() )" into:
select A,B,C,SUM(D) from table GROUP BY A,B,C
UNION ALL
select A,NULL,NULL,SUM(D) from table GROUP BY A
UNION ALL
select NULL,NULL,NULL,SUM(D) from table;
So it's very simple: it doesn't require modification of the executor or cost calibration - only the parser and semantic analysis.
So ROLLUP(A1,...,An) is an alias for "GROUP BY GROUPING SETS( (A1,...,An), (A1,...,An-1), ..., (A1), () )", and CUBE is analogous (all 2^n column subsets).
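The two expansions can be written mechanically; a sketch (the function names are mine, not from any patch):

```python
from itertools import chain, combinations

def rollup(cols):
    """ROLLUP(A1..An) -> the prefixes (A1..An), (A1..An-1), ..., (A1), ()."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube(cols):
    """CUBE(A1..An) -> all 2^n subsets of the columns."""
    return list(chain.from_iterable(combinations(cols, r)
                                    for r in range(len(cols), -1, -1)))

print(rollup(["a", "b", "c"]))
# [('a', 'b', 'c'), ('a', 'b'), ('a',), ()]
print(len(cube(["a", "b", "c"])))
# 8
```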
If this idea is good, I can write code based on the old patch
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php
or from a clean slate (as you wish).
For the future, I know how to implement ROLLUP more optimally (as an executor iterator) and how to use that ROLLUP to optimize other GROUP BY and GROUPING SETS queries.
Thanks.
Hello
some other info is on http://wiki.postgresql.org/wiki/Grouping_Sets
Maybe in a few weeks I'll have a prototype based on CTE. My older technique based on hash tables should work well, but it carries a lot of unshared code. Using CTE means we can optimize CTE and GROUPING SETS together. I plan to transform the query
SELECT * FROM some GROUP BY GROUPING SETS(a, b)
to
WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b
2009/5/10 Олег Царев <zabivator@gmail.com>:
My point: this feature can be implemented over GROUP BY and UNION ALL [...] it's very simple, doesn't require modification of the executor - only the parser and semantic analysis.
This is suboptimal. With SELECT * FROM X GROUP BY GROUPING SETS ..., where X is some joined data, you repeat the JOIN for every grouping set. So your solution is simple to implement, but it would be really slow.
Regards
Pavel Stehule
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello, Pavel.
I read your letter, and have some points:
a) A simple transformation of any GROUPING SETS into GROUP BY + UNION ALL requires cloning the source for every group. That is slow. My point: we can start with a simple implementation.
b) Your solution is good, IMHO:
WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b
But can you describe how it is handled by the planner (cost calculation) and the executor? Sharing a source means we no longer process a tree, we process a DIRECTED GRAPH. Hard, many bugs, and so on, no? Does PostgreSQL support sharing executor nodes? How? Where can I read about it?
Next: OK, we spool the source. After that, do we use a simple hash group / sort group + UNION ALL, or more advanced techniques? If it is just different GROUP BYs, it is no different from my idea - only additional logic (spooling the source for fast rewind).
c) Don't forget about node requirements: if we use a hash table as the container for the source, we lose sorting properties, and a query like
select A,B from TABLE group by grouping sets((A,B),(B)) order by a,b
requires a sort on the output of the grouping sets. That means we need to insert a Sort node.
d) We can make a SORT ROLLUP (and a sort-reduce ROLLUP).
First, a sort-based GROUP BY uses sorting to group the data and calculate the aggregations. Now, data ordered by A,B,C is also ordered by A,B, by A, and by (). What does this mean? It means we can use this property of sorting to implement SORT ROLLUP. Let me explain with "select a,sum(b) from table group by a": a sort-based Group node groups by a, and for every a-group calculates the aggregates (sum). For ROLLUP(A,B) we can maintain three sets of aggregate states - for (A,B), for (A), and for () - and in a single pass over the sorted source update the aggregates at every level, splitting the source into the different subsets as the keys change.
It performs better than separate GROUP BYs combined with UNION ALL.
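As a sketch of point (d): a minimal single-pass SUM rollup in Python, assuming the input arrives already sorted by (A,B). The data and function name are mine, purely for illustration:

```python
def sort_rollup_sum(sorted_rows):
    """One pass over (A, B, D) rows sorted by (A, B): keep one running SUM
    per rollup level - (A,B), (A) and () - and emit a level's row whenever
    that level's key changes."""
    out = []
    acc = [0, 0, 0]            # accumulators for levels (A,B), (A), ()
    prev = None
    for a, b, d in sorted_rows:
        if prev is not None:
            if (a, b) != prev:               # the (A,B) group closed
                out.append((prev[0], prev[1], acc[0])); acc[0] = 0
            if a != prev[0]:                 # the (A) group closed
                out.append((prev[0], None, acc[1])); acc[1] = 0
        acc = [x + d for x in acc]           # advance all three levels at once
        prev = (a, b)
    if prev is not None:                     # flush the final groups
        out.append((prev[0], prev[1], acc[0]))
        out.append((prev[0], None, acc[1]))
        out.append((None, None, acc[2]))     # the grand total
    return out

rows = [("x", 1, 10), ("x", 1, 5), ("x", 2, 7), ("y", 1, 3)]
# -> [('x', 1, 15), ('x', 2, 7), ('x', None, 22),
#     ('y', 1, 3), ('y', None, 3), (None, None, 25)]
```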
Look at MS SQL:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
Why doesn't MS SQL support distinct aggregates here? I think it is because every distinct aggregate in MS SQL requires a hash, and many such aggregations would be too slow.
Thank you!
2009/5/10 Олег Царев <zabivator@gmail.com>:
Hello, Pavel. [...]
b) Your solution is good, IMHO:
WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b
But can you describe how it is handled by the planner (cost calculation) and the executor?
Look at the CTE source code, or ask the CTE author.
Sharing a source means we no longer process a tree, we process a DIRECTED GRAPH. Hard, many bugs, and so on, no?
For example, the code for DISTINCT is shared with the code for aggregation. My idea is based on a similarity: GROUPING SETS is a subset of CTE, and nonrecursive CTE is a subset of UNION ALL.
Does PostgreSQL support sharing executor nodes? How? Where can I read about it?
I don't know - what exactly do you mean by sharing nodes? Before CTE I had to write a special holder node, but that isn't necessary now.
Next: OK, we spool the source. After that, do we use a simple hash group / sort group + UNION ALL, or more advanced techniques? If it is just different GROUP BYs, it is no different from my idea - only additional logic (spooling the source for fast rewind).
I think a trivial implementation via UNION ALL should work, but any optimisation means a lot of work - and once you add rewind functionality, you will have a trivial nonrecursive CTE implementation. When I wrote my patch it wasn't possible to use CTE, so I played with my own nodes. Now the cleaner solution is probably to build on the current CTE base.
c) Don't forget about node requirements: if we use a hash table as the container for the source, we lose sorting properties, and a query like
select A,B from TABLE group by grouping sets((A,B),(B)) order by a,b
requires a sort on the output of the grouping sets. That means we need to insert a Sort node.
Yes, some techniques should be possible there to eliminate the final sort op. I am not sure it is really important. Right now I am thinking about which is the bigger devil: a) patch complexity, or b) some suboptimality (mainly redundant calls of agg nodes). For me, a). Aggregates are not free, but repeated seq scans of the data are significantly more expensive, so that is the first goal. Later we should think about further optimisations.
d) We can make a SORT ROLLUP [...] It performs better than separate GROUP BYs combined with UNION ALL.
Maybe, I don't know. What I know:
* UNION ALL will do better on indexed sets with MIN and MAX aggregates; CTE isn't optimized for these cases yet.
* UNION ALL will be slow for large sets (bigger than cache),
* CTE needs some more optimisation,
* pg core hackers dislike big and complex patches,
* pg core hackers like improvements to the current code base.
I am thinking that Grouping Sets based on CTE should be more committable code. That doesn't mean your ideas are wrong, but those optimisations should work for CTE too.
select * from table group by rollup(a,b,c)
has to generate the same plan as
with q as (select * from table)
select * from q group by a,b,c
union all
select * from q group by a,b
union all
select * from q group by a
union all
select * from q;
And CTE is more general than Grouping Sets, so it is better to do the optimisation over CTE than over Grouping Sets.
Regards
Pavel Stehule
I will write a separate mail about rollup.
Can you send me some links or information about "CTE"? What is it?
Also, I need some deeper knowledge about PostgreSQL's aggregate calculation (the executor part) - can you help me (links, books, names of source files, etc.)?
After getting this additional information I can continue the discussion.
Thanks
2009/5/11 Pavel Stehule <pavel.stehule@gmail.com>:
I am thinking that Grouping Sets based on CTE should be more committable code. [...] CTE is more general than Grouping Sets, so it is better to do the optimisation over CTE than over Grouping Sets.
If you need to buffer tuples from the outer plan and rescan them multiple times, a tuplestore seems a more appropriate solution than using a CTE node, from a semantic point of view. During CTE and window function development, tuplestore gained that kind of capability, and the CTE node is now only a wrapper around a tuplestore.
Moreover, I guess you don't even need to buffer tuples to aggregate by different keys. All you have to do is prepare more than one hash table (or set up a sort order if the planner detects that a hash table would be too large to fit in memory), and a single seq scan will do. Only the transition values are stored in memory, not the outer plan's results. That would be a great win in performance.
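The multiple-hash-table idea - one sequential scan feeding one hash table of transition values per grouping set - can be sketched like this (the sample data and function name are mine, purely for illustration):

```python
def grouping_sets_hash(rows, keysets):
    """One sequential scan over the input; one hash table of SUM transition
    values per grouping set. Only the transition values live in memory,
    never the scanned rows themselves."""
    tables = [dict() for _ in keysets]
    for r in rows:
        for tab, keyset in zip(tables, keysets):
            # key columns not in this grouping set become None (SQL NULL)
            key = tuple(r[i] if i in keyset else None for i in range(len(r) - 1))
            tab[key] = tab.get(key, 0) + r[-1]   # advance SUM of the last column
    return [k + (v,) for tab in tables for k, v in tab.items()]

rows = [("x", 1, 10), ("x", 1, 5), ("x", 2, 7), ("y", 1, 3)]
res = grouping_sets_hash(rows, [{0}, {1}])   # GROUPING SETS (a, b)
assert ("x", None, 22) in res and (None, 1, 18) in res
```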
Regards,
--
Hitoshi Harada
2009/5/12 Hitoshi Harada <umi.tanuki@gmail.com>:
Moreover, I guess you don't even need to buffer tuples to aggregate by different keys. [...] Only the transition values are stored in memory, not the outer plan's results. That would be a great win in performance.
That was my first solution. But I would like to prepare one non-hash method as well. Now I am thinking about a special executor node that fills all the necessary hash tables in parallel - a special variant of hash aggregate.
regards
Pavel Stehule
On Tue, May 12, 2009 at 2:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
[...] That was my first solution. But I would like to prepare one non-hash method as well. Now I am thinking about a special executor node that fills all the necessary hash tables in parallel - a special variant of hash aggregate.
I think HashAggregate will often be the fastest method of executing
this kind of operation, but it would be nice to have an alternative
(such as repeatedly sorting a tuplestore) to handle non-hashable
datatypes or cases where the HashAggregate would eat too much memory.
But that leads me to a question - does the existing HashAggregate code
make any attempt to obey work_mem? I know that the infrastructure is
present for HashJoin/Hash, but on a quick pass I didn't notice
anything similar in HashAggregate.
And on a slightly off-topic note for this thread, is there any
compelling reason why we have at least three different hash
implementations in the executor? HashJoin/Hash uses one for regular
batches and one for the skew batch, and I believe that HashAggregate
does something else entirely. It seems like it might improve code
maintainability, if nothing else, to unify these to the extent
possible.
...Robert
Hello Oleg
I am sending a new CTE-based variant of my GROUPING SETS patch. This patch has some bugs, but it is a good prototype (more stable than the old patch):
postgres=# select selling_date, baguette, sum(items) from
baguette_selling group by grouping sets(1,2);
selling_date | baguette | sum
--------------+----------+-----
2007-10-30 | | 17
2007-10-31 | | 12
| golf | 9
| buster | 20
(4 rows)
postgres=# select selling_date, baguette, sum(items),
grouping(selling_date), grouping(baguette), grouping_id(selling_date,
baguette) from baguette_selling group by grouping sets(1,2);
selling_date | baguette | sum | grouping | grouping | grouping_id
--------------+----------+-----+----------+----------+-------------
2007-10-30 | | 17 | 1 | 0 | 2
2007-10-31 | | 12 | 1 | 0 | 2
| golf | 9 | 0 | 1 | 1
| buster | 20 | 0 | 1 | 1
(4 rows)
postgres=# select selling_date, baguette, sum(items),
grouping(selling_date), grouping(baguette), grouping_id(selling_date,
baguette) from baguette_selling group by grouping sets(1,2,());
selling_date | baguette | sum | grouping | grouping | grouping_id
--------------+----------+-----+----------+----------+-------------
2007-10-30 | | 17 | 1 | 0 | 2
2007-10-31 | | 12 | 1 | 0 | 2
| golf | 9 | 0 | 1 | 1
| buster | 20 | 0 | 1 | 1
| | 29 | 0 | 0 | 0
(5 rows)
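A note on the GROUPING/GROUPING_ID values shown above: the patch computes GROUPING_ID by shifting in one bit per argument, set when the expression is a member of that branch's group clause. (This is inverted relative to the SQL standard, where GROUPING() returns 1 for a column that was aggregated away.) A minimal Python sketch of the patch's convention:

```python
# The patch's convention (see transform_ungrouped_target_mutator in
# parse_agg.c): GROUPING(expr) is 1 when expr is a member of the
# branch's group clause; GROUPING_ID shifts one such bit in per
# argument, left to right.
def grouping(expr, group_clause):
    return 1 if expr in group_clause else 0

def grouping_id(exprs, group_clause):
    result = 0
    for e in exprs:
        result = (result << 1) | grouping(e, group_clause)
    return result

# Branch grouped by selling_date only -> bits 10 -> grouping_id = 2
gid = grouping_id(['selling_date', 'baguette'], {'selling_date'})
```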
I think the parser part is good and correct (and it is ported to 8.4).
The CTE works well, but it is not 100% efficient; it would be better
to use the tuplestore interface directly (as a second technique, for
when hash tables cannot be used).
I think the best solution is to enhance the current Aggregate node to
support grouping sets. A code base built on UNION ALL is more or less
equivalent to the CTE approach, and I don't think it can be optimal.
But please work on it freely - I will have no free time for this patch
for the next two months. So if you have time, it's yours.
regards
Pavel Stehule
2009/5/10 Олег Царев <zabivator@gmail.com>:
So, ROLLUP(A1,...,An) is an alias for GROUP BY GROUPING SETS(
(A1,...,An), (A1,...,An-1), ..., (A1), () ),
and CUBE is analogous (all subsets). If this idea is good, I can write
code based on the old patch
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
from a clean slate (as you wish). In the future I know how to implement
ROLLUP more optimally (as an executor iterator) and to use that ROLLUP
to optimize other GROUP BY and GROUPING SETS queries. Thanks.
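Those expansions can be sketched in Python (illustrative only; the function names are hypothetical, and the patch builds the equivalent List structures in parse_gsets.c):

```python
# Hypothetical helpers mirroring what parse_gsets.c builds as List
# structures: ROLLUP expands to the prefixes of its argument list,
# CUBE to all subsets (the patch caps CUBE at MAX_CUBE_FIELDS = 8,
# i.e. at most 2**8 grouping sets).
def rollup(cols):
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube(cols):
    return [tuple(c for i, c in enumerate(cols) if mask & (1 << i))
            for mask in range(2 ** len(cols) - 1, -1, -1)]
```

So ROLLUP(A,B,C) yields (A,B,C), (A,B), (A), (), and CUBE(A,B) yields the four subsets of {A,B}.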
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachments:
gsets-0.3.diff (text/x-patch; charset=US-ASCII)
*** ./src/backend/nodes/copyfuncs.c.orig 2009-05-04 19:54:52.000000000 +0200
--- ./src/backend/nodes/copyfuncs.c 2009-05-04 21:00:33.000000000 +0200
***************
*** 1056,1061 ****
--- 1056,1092 ----
}
/*
+ * _copyGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _copyGroupingSetsFunc(GroupingSetsFunc *from)
+ {
+ GroupingSetsFunc *newnode = makeNode(GroupingSetsFunc);
+
+ COPY_SCALAR_FIELD(identity);
+ COPY_NODE_FIELD(expr);
+ COPY_NODE_FIELD(expr_list);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
+ * _copyGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _copyGroupingSetsSpec(GroupingSetsSpec *from)
+ {
+ GroupingSetsSpec *newnode = makeNode(GroupingSetsSpec);
+
+ COPY_NODE_FIELD(set_list);
+ COPY_SCALAR_FIELD(has_empty_set);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
* _copyScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 4093,4098 ****
--- 4124,4135 ----
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
+ case T_GroupingSetsFunc:
+ retval = _copyGroupingSetsFunc(from);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _copyGroupingSetsSpec(from);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** ./src/backend/nodes/equalfuncs.c.orig 2009-05-04 20:01:09.000000000 +0200
--- ./src/backend/nodes/equalfuncs.c 2009-05-04 20:47:27.000000000 +0200
***************
*** 290,295 ****
--- 290,316 ----
}
static bool
+ _equalGroupingSetsFunc(GroupingSetsFunc *a, GroupingSetsFunc *b)
+ {
+ COMPARE_SCALAR_FIELD(identity);
+ COMPARE_NODE_FIELD(expr);
+ COMPARE_NODE_FIELD(expr_list);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
+ _equalGroupingSetsSpec(GroupingSetsSpec *a, GroupingSetsSpec *b)
+ {
+ COMPARE_NODE_FIELD(set_list);
+ COMPARE_SCALAR_FIELD(has_empty_set);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
_equalScalarArrayOpExpr(ScalarArrayOpExpr *a, ScalarArrayOpExpr *b)
{
COMPARE_SCALAR_FIELD(opno);
***************
*** 2871,2876 ****
--- 2892,2903 ----
case T_XmlSerialize:
retval = _equalXmlSerialize(a, b);
break;
+ case T_GroupingSetsFunc:
+ retval = _equalGroupingSetsFunc(a, b);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _equalGroupingSetsSpec(a, b);
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
*** ./src/backend/nodes/nodeFuncs.c.orig 2009-05-04 21:56:15.000000000 +0200
--- ./src/backend/nodes/nodeFuncs.c 2009-05-05 14:05:41.000000000 +0200
***************
*** 917,922 ****
--- 917,928 ----
/* just use argument's location */
loc = exprLocation((Node *) ((PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupingSetsFunc:
+ loc = ((GroupingSetsFunc *) expr)->location;
+ break;
+ case T_GroupingSetsSpec:
+ loc = ((GroupingSetsSpec *) expr)->location;
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
***************
*** 1342,1347 ****
--- 1348,1368 ----
break;
case T_PlaceHolderInfo:
return walker(((PlaceHolderInfo *) node)->ph_var, context);
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+
+ if (expression_tree_walker(gsf->expr, walker, context))
+ return true;
+ if (expression_tree_walker((Node *) gsf->expr_list, walker, context))
+ return true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ if (expression_tree_walker((Node *)((GroupingSetsSpec *) node)->set_list,
+ walker, context))
+ return true;
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
***************
*** 2016,2021 ****
--- 2037,2063 ----
return (Node *) newnode;
}
break;
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ GroupingSetsFunc *newnode;
+
+ FLATCOPY(newnode, gsf, GroupingSetsFunc);
+ MUTATE(newnode->expr, gsf->expr, Node *);
+ MUTATE(newnode->expr_list, gsf->expr_list, List *);
+ return (Node *) newnode;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+ GroupingSetsSpec *newnode;
+
+ FLATCOPY(newnode, gss, GroupingSetsSpec);
+ MUTATE(newnode->set_list, gss->set_list, List *);
+ return (Node *) newnode;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
*** ./src/backend/nodes/outfuncs.c.orig 2009-05-04 20:04:05.000000000 +0200
--- ./src/backend/nodes/outfuncs.c 2009-05-04 20:52:47.000000000 +0200
***************
*** 902,907 ****
--- 902,928 ----
}
static void
+ _outGroupingSetsFunc(StringInfo str, GroupingSetsFunc *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSFUNC");
+
+ WRITE_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ WRITE_NODE_FIELD(expr);
+ WRITE_NODE_FIELD(expr_list);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
+ _outGroupingSetsSpec(StringInfo str, GroupingSetsSpec *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSSPEC");
+
+ WRITE_NODE_FIELD(set_list);
+ WRITE_BOOL_FIELD(has_empty_set);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
_outScalarArrayOpExpr(StringInfo str, ScalarArrayOpExpr *node)
{
WRITE_NODE_TYPE("SCALARARRAYOPEXPR");
***************
*** 2774,2779 ****
--- 2795,2806 ----
case T_XmlSerialize:
_outXmlSerialize(str, obj);
break;
+ case T_GroupingSetsFunc:
+ _outGroupingSetsFunc(str, obj);
+ break;
+ case T_GroupingSetsSpec:
+ _outGroupingSetsSpec(str, obj);
+ break;
default:
*** ./src/backend/nodes/readfuncs.c.orig 2009-05-04 20:09:32.000000000 +0200
--- ./src/backend/nodes/readfuncs.c 2009-05-04 21:03:35.000000000 +0200
***************
*** 583,588 ****
--- 583,620 ----
}
/*
+ * _readGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _readGroupingSetsFunc(void)
+ {
+ READ_LOCALS(GroupingSetsFunc);
+
+ READ_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ READ_NODE_FIELD(expr);
+ READ_NODE_FIELD(expr_list);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _readGroupingSetsSpec(void)
+ {
+ READ_LOCALS(GroupingSetsSpec);
+
+ READ_NODE_FIELD(set_list);
+ READ_BOOL_FIELD(has_empty_set);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+
+ /*
* _readScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 1274,1279 ****
--- 1306,1315 ----
return_value = _readNotifyStmt();
else if (MATCH("DECLARECURSOR", 13))
return_value = _readDeclareCursorStmt();
+ else if (MATCH("GROUPINGSETSFUNC", 16))
+ return_value = _readGroupingSetsFunc();
+ else if (MATCH("GROUPINGSETSSPEC", 16))
+ return_value = _readGroupingSetsSpec();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
*** ./src/backend/parser/analyze.c.orig 2009-05-05 14:14:42.000000000 +0200
--- ./src/backend/parser/analyze.c 2009-05-12 22:32:12.000000000 +0200
***************
*** 34,39 ****
--- 34,40 ----
#include "parser/parse_clause.h"
#include "parser/parse_coerce.h"
#include "parser/parse_cte.h"
+ #include "parser/parse_gsets.h"
#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "parser/parse_target.h"
***************
*** 148,153 ****
--- 149,300 ----
}
/*
+ * process GROUPING SETS
+ */
+ static SelectStmt *
+ makeSelectStmt(List *targetList, List *fromClause)
+ {
+ SelectStmt *n = makeNode(SelectStmt);
+ n->distinctClause = NULL;
+ n->intoClause = NULL;
+ n->targetList = targetList;
+ n->fromClause = fromClause;
+ n->whereClause = NULL;
+ n->groupClause = NULL;
+ n->havingClause = NULL;
+ n->windowClause = NIL;
+ n->withClause = NULL;
+ n->valuesLists = NIL;
+ n->sortClause = NIL;
+ n->limitOffset = NULL;
+ n->limitCount = NULL;
+ n->lockingClause = NIL;
+ n->op = SETOP_NONE;
+ n->all = false;
+ n->larg = NULL;
+ n->rarg = NULL;
+ return n;
+ }
+
+ static List *
+ makeStarTargetList(void)
+ {
+ ResTarget *rt = makeNode(ResTarget);
+
+ rt->name = NULL;
+ rt->indirection = NIL;
+ rt->val = (Node *) makeNode(ColumnRef);
+ ((ColumnRef *) rt->val)->fields = list_make1(makeNode(A_Star));
+ rt->location = -1;
+
+ return list_make1(rt);
+ }
+
+ static SelectStmt *
+ transformGroupingSets(ParseState *pstate, SelectStmt *stmt)
+ {
+ stmt->groupClause = (List *) expandGroupingSets(pstate, stmt->groupClause);
+
+ if (pstate->p_hasGroupingSets)
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) stmt->groupClause;
+ CommonTableExpr *cte = makeNode(CommonTableExpr);
+ SelectStmt *cteedstmt;
+ int ngroupingsets = list_length(gss->set_list) + (gss->has_empty_set ? 1 : 0);
+
+ cteedstmt = makeSelectStmt(NIL, NIL);
+ //list_make1(makeRangeVar(NULL, "**g**", -1)));
+
+ cteedstmt->intoClause = stmt->intoClause;
+
+ cte->ctename = "**g**";
+ cte->ctequery = (Node *) stmt;//(Node *) makeSelectStmt( makeStarTargetList(),
+ // list_make1(makeRangeVar(NULL, "bagety", -1)));
+
+ cte->location = -1;
+
+ cteedstmt->withClause = makeNode(WithClause);
+ cteedstmt->withClause->ctes = list_make1(cte);
+ cteedstmt->withClause->recursive = false;
+ cteedstmt->withClause->location = -1;
+
+ /* when there is more than one grouping set, we must generate a setop node */
+ if (ngroupingsets > 1)
+ {
+ /* add a query under UNION ALL for every grouping set */
+ SelectStmt *larg = NULL;
+ SelectStmt *rarg;
+ ListCell *lc;
+
+ foreach(lc, gss->set_list)
+ {
+ List *groupClause;
+
+ Assert(IsA(lfirst(lc), List));
+ groupClause = (List *) lfirst(lc);
+
+ if (larg == NULL)
+ {
+ larg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ larg->groupClause = groupClause;
+ larg->havingClause = copyObject(stmt->havingClause);
+ }
+ else
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->groupClause = groupClause;
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = true;
+
+ larg = setop;
+ }
+ }
+ if (gss->has_empty_set)
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = true;
+
+ larg = setop;
+ }
+ /* merge larg to result */
+ cteedstmt->op = larg->op;
+ cteedstmt->larg = larg->larg;
+ cteedstmt->rarg = larg->rarg;
+ cteedstmt->all = larg->all;
+ }
+ else
+ {
+ /* there isn't used setop node */
+
+ cteedstmt->targetList = copyObject(stmt->targetList);
+ cteedstmt->fromClause = list_make1(makeRangeVar(NULL, "**g**", -1));
+ }
+
+ ((SelectStmt *)cte->ctequery)->targetList = makeStarTargetList();
+ ((SelectStmt *)cte->ctequery)->groupClause = NIL;
+ return cteedstmt;
+ }
+
+ return stmt;
+ }
+
+ /*
* transformStmt -
* transform a Parse tree into a Query tree.
*/
***************
*** 176,181 ****
--- 323,330 ----
case T_SelectStmt:
{
SelectStmt *n = (SelectStmt *) parseTree;
+
+ n = transformGroupingSets(pstate, n);
if (n->valuesLists)
result = transformValuesClause(pstate, n);
***************
*** 829,835 ****
stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
--- 978,984 ----
stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
*** ./src/backend/parser/gram.y.orig 2009-05-04 18:07:44.000000000 +0200
--- ./src/backend/parser/gram.y 2009-05-05 15:23:01.000000000 +0200
***************
*** 118,123 ****
--- 118,125 ----
static Node *makeNullAConst(int location);
static Node *makeAConst(Value *v, int location);
static Node *makeBoolAConst(bool state, int location);
+ static Node *makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr,
+ List *expr_list, int location);
static FuncCall *makeOverlaps(List *largs, List *rargs, int location);
static void check_qualified_name(List *names);
static List *check_func_name(List *names);
***************
*** 416,421 ****
--- 418,426 ----
%type <str> opt_existing_window_name
%type <ival> opt_frame_clause frame_extent frame_bound
+ %type <node> grouping_element empty_grouping_set grouping_sets_spec
+ %type <list> grouping_element_list
+ %type <boolean> opt_grouping_set_quantifier
/*
* If you make any token changes, update the keyword table in
***************
*** 438,444 ****
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
--- 443,449 ----
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CUBE CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
***************
*** 451,457 ****
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P
HANDLER HAVING HEADER_P HOLD HOUR_P
--- 456,462 ----
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P GROUPING GROUPING_ID
HANDLER HAVING HEADER_P HOLD HOUR_P
***************
*** 485,494 ****
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
--- 490,499 ----
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SETS SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
***************
*** 7227,7234 ****
;
group_clause:
! GROUP_P BY expr_list { $$ = $3; }
! | /*EMPTY*/ { $$ = NIL; }
;
having_clause:
--- 7232,7285 ----
;
group_clause:
! GROUP_P BY opt_grouping_set_quantifier grouping_element_list
! {
! if (!$3)
! ereport(ERROR,
! (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! errmsg("GROUP BY DISTINCT isn't supported")));
! $$ = $4;
! }
! | /*EMPTY*/
! {
! $$ = NIL;
! }
! ;
!
! grouping_sets_spec:
! GROUPING SETS '(' grouping_element_list ')'
! {
! /* We cannot identify and drop empty sets yet. */
! GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
! gss->set_list = $4;
! gss->has_empty_set = false;
! gss->location = @1;
! $$ = (Node *) gss;
! }
! ;
!
! grouping_element_list:
! grouping_element { $$ = list_make1($1); }
! | grouping_element_list ',' grouping_element { $$ = lappend($1, $3); }
! ;
!
! grouping_element:
! a_expr { $$ = $1; }
! | grouping_sets_spec { $$ = $1; }
! | empty_grouping_set { $$ = $1; }
! ;
!
! empty_grouping_set:
! '(' ')'
! {
! $$ = NULL;
! }
! ;
!
! opt_grouping_set_quantifier:
! ALL { $$ = true; }
! | DISTINCT { $$ = false; }
! | /*EMPTY*/ { $$ = true; }
;
having_clause:
***************
*** 9266,9271 ****
--- 9317,9345 ----
n->location = @1;
$$ = (Node *)n;
}
+ | GROUPING '(' a_expr ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING, $3, NIL, @1);
+ }
+ | GROUPING_ID '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING_ID, NULL, $3, @1);
+ }
+ | CUBE '(' expr_list ')'
+ {
+ /*
+ * CUBE can be used in two different contexts: first, as
+ * part of a GROUPING SETS specification; second, as a
+ * normal UDF from the contrib cube module. When we are
+ * not in a grouping sets context, the node is transformed
+ * to a FuncCall node later.
+ */
+ $$ = makeGroupingSetsFunc(FUNCTION_CUBE, NULL, $3, @1);
+ }
+ | ROLLUP '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_ROLLUP, NULL, $3, @1);
+ }
;
/*
***************
*** 10320,10325 ****
--- 10394,10400 ----
| SERVER
| SESSION
| SET
+ | SETS
| SHARE
| SHOW
| SIMPLE
***************
*** 10395,10400 ****
--- 10470,10476 ----
| EXTRACT
| FLOAT_P
| GREATEST
+ | GROUPING_ID
| INOUT
| INT_P
| INTEGER
***************
*** 10489,10494 ****
--- 10565,10571 ----
| COLUMN
| CONSTRAINT
| CREATE
+ | CUBE
| CURRENT_CATALOG
| CURRENT_DATE
| CURRENT_ROLE
***************
*** 10510,10515 ****
--- 10587,10593 ----
| FROM
| GRANT
| GROUP_P
+ | GROUPING
| HAVING
| IN_P
| INITIALLY
***************
*** 10533,10538 ****
--- 10611,10617 ----
| PRIMARY
| REFERENCES
| RETURNING
+ | ROLLUP
| SELECT
| SESSION_USER
| SOME
***************
*** 11047,11052 ****
--- 11126,11143 ----
return (Node *) x;
}
+ static Node *
+ makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr, List *expr_list, int location)
+ {
+ GroupingSetsFunc *gsfunc = makeNode(GroupingSetsFunc);
+
+ gsfunc->identity = identity;
+ gsfunc->expr = expr;
+ gsfunc->expr_list = expr_list;
+ gsfunc->location = location;
+ return (Node *) gsfunc;
+ }
+
/* parser_init()
* Initialize to parse one query string
*/
*** ./src/backend/parser/Makefile.orig 2009-05-04 21:23:31.000000000 +0200
--- ./src/backend/parser/Makefile 2009-05-04 21:24:06.000000000 +0200
***************
*** 14,20 ****
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o
FLEXFLAGS = -CF
--- 14,21 ----
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o \
! parse_gsets.o
FLEXFLAGS = -CF
*** ./src/backend/parser/parse_agg.c.orig 2009-05-11 21:12:59.000000000 +0200
--- ./src/backend/parser/parse_agg.c 2009-05-12 22:30:37.000000000 +0200
***************
*** 32,42 ****
--- 32,51 ----
int sublevels_up;
} check_ungrouped_columns_context;
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ } transform_ungrouped_target_context;
+
static void check_ungrouped_columns(Node *node, ParseState *pstate,
List *groupClauses, bool have_non_var_grouping);
static bool check_ungrouped_columns_walker(Node *node,
check_ungrouped_columns_context *context);
+ static Node * transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses);
+
/*
* transformAggregateCall -
***************
*** 318,324 ****
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! clause = (Node *) qry->targetList;
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
--- 327,344 ----
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! if (pstate->parentParseState && pstate->parentParseState->p_hasGroupingSets)
! {
! clause = (Node *) transform_ungrouped_target((Node *) qry->targetList,
! pstate,
! groupClauses);
! /* HACK!!! - move to transform part*/
! qry->targetList = clause;
! }
! else
! clause = (Node *) qry->targetList;
!
!
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
***************
*** 426,431 ****
--- 446,517 ----
}
/*
+ * transform_ungrouped_target_mutator -
+ * All non-aggregate, non-constant columns are replaced by NULL;
+ * grouping and grouping_id functions are replaced by constants.
+ */
+ static Node *
+ transform_ungrouped_target_mutator(Node *node,
+ transform_ungrouped_target_context *context)
+ {
+ if (node == NULL)
+ return NULL;
+ if (IsA(node, Aggref))
+ return node;
+
+ if (IsA(node, GroupingSetsFunc))
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ int result = 0;
+
+ if (gsf->identity == FUNCTION_GROUPING)
+ {
+ result = list_member(context->groupClause, gsf->expr);
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else if (gsf->identity == FUNCTION_GROUPING_ID)
+ {
+ ListCell *el;
+
+ foreach(el, gsf->expr_list)
+ {
+ result = result << 1;
+ if (list_member(context->groupClause, lfirst(el)))
+ result = result | 0x01;
+ }
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else /* replace Cube and Rollup by FuncCall node */
+ {
+ /* ToDo: Support cube function */
+ }
+ }
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ if (list_member(context->groupClause, node))
+ return node;
+ else
+ return (Node *) makeNullConst(var->vartype, var->vartypmod);
+ }
+
+ return expression_tree_mutator(node,
+ transform_ungrouped_target_mutator, context);
+ }
+
+ static Node *
+ transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses)
+ {
+ transform_ungrouped_target_context context;
+
+ context.pstate = pstate;
+ context.groupClause = groupClauses;
+ return transform_ungrouped_target_mutator(node, &context);
+ }
+ /*
* check_ungrouped_columns -
* Scan the given expression tree for ungrouped variables (variables
* that are not listed in the groupClauses list and are not within
*** ./src/backend/parser/parse_expr.c.orig 2009-05-12 22:33:29.000000000 +0200
--- ./src/backend/parser/parse_expr.c 2009-05-12 22:53:56.000000000 +0200
***************
*** 273,278 ****
--- 273,296 ----
case T_CurrentOfExpr:
result = transformCurrentOfExpr(pstate, (CurrentOfExpr *) expr);
break;
+
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) expr;
+ ListCell *lc;
+ List *expr_list = NIL;
+
+ gsf->expr = (Node *) transformExpr(pstate, (Node *) gsf->expr);
+
+ foreach(lc, gsf->expr_list)
+ {
+ expr_list = lappend(expr_list, transformExpr(pstate,
+ (Node *) lfirst(lc)));
+ }
+ gsf->expr_list = expr_list;
+ result = expr;
+ break;
+ }
/*********************************************
* Quietly accept node types that may be presented when we are
*** ./src/backend/parser/parse_gsets.c.orig 2009-05-04 21:13:54.000000000 +0200
--- ./src/backend/parser/parse_gsets.c 2009-05-12 22:12:34.000000000 +0200
***************
*** 0 ****
--- 1,679 ----
+ /*-------------------------------------------------------------------------
+ *
+ * parse_gsets.c
+ * parser grouping sets transformations
+ *
+ * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * $PostgreSQL: $
+ *
+ *-------------------------------------------------------------------------
+ */
+ #include "postgres.h"
+ #include "catalog/namespace.h"
+ #include "nodes/nodes.h"
+ #include "nodes/makefuncs.h"
+ #include "nodes/nodeFuncs.h"
+ #include "nodes/value.h"
+ #include "nodes/pg_list.h"
+ #include "nodes/parsenodes.h"
+ #include "optimizer/tlist.h"
+ #include "parser/parse_clause.h"
+ #include "parser/parse_gsets.h"
+ #include "parser/parse_node.h"
+ #include "rewrite/rewriteManip.h"
+
+ #include <string.h>
+
+
+ #define UNDEF_LOCATION -1
+
+
+ #define ORDER_CLAUSE 0
+ #define GROUP_CLAUSE 1
+ #define DISTINCT_ON_CLAUSE 2
+
+ #define MAX_CUBE_FIELDS 8
+
+ #define IsGroupingSetsSpec(n) (IsA(n, GroupingSetsSpec))
+ #define IsUndefLoc(n) (n == UNDEF_LOCATION)
+ #define IsNIL(n) (n == NIL)
+
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ bool isGlobalTL;
+ } transform_ungroup_cols_context;
+
+ static List *add_distinct_set(List *gsets_list, List *gsets);
+ static List *add_distinct_sets(List *cum_gsets_list, List *gsets_list);
+ static Node *set_multiplication(List *list, bool has_empty_set);
+ static Node *expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset);
+
+
+ /*
+ * return true, when GroupingSet is empty
+ */
+ static bool
+ IsEmptyGroupingSet(Node *node)
+ {
+ if (node == NULL)
+ return true;
+
+ if (IsA(node, GroupingSetsSpec))
+ return IsNIL(((GroupingSetsSpec *) node)->set_list);
+
+ return false;
+ }
+
+ /*
+ * expandGroupingSets does:
+ * a) search and replace CUBE and ROLLUP FuncCall nodes with GroupingSetsSpec
+ * nodes,
+ * b) multiply all sets in the GROUP BY list,
+ * c) drop all NULL nodes.
+ *
+ */
+ Node *
+ expandGroupingSets(ParseState *pstate, List *grouplist)
+ {
+ ListCell *g;
+ List *result = NIL;
+ bool has_empty_set = false;
+ int location = UNDEF_LOCATION;
+ bool is_grouping_sets = false;
+
+ /* leave fast when grouplist is NIL */
+ if (IsNIL(grouplist))
+ return NULL;
+
+ /* detect single group by grouping set */
+ foreach(g, grouplist)
+ {
+ Node *el = (Node *) lfirst(g);
+ Node *expel;
+ bool isemptyset;
+
+ /*
+ * Is it an empty set? NULL is a shortcut for the empty set.
+ * The disadvantage is that we lose the location, but it is short.
+ */
+ if (IsEmptyGroupingSet(el))
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ /* take location from first item of list */
+ if (IsUndefLoc(location))
+ location = exprLocation(el);
+
+ expel = expandGroupingSetsOperator(pstate, el, &isemptyset);
+
+ if (isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ if (IsA(el, GroupingSetsSpec))
+ is_grouping_sets = true;
+ else
+ {
+ switch (((GroupingSetsFunc *) el)->identity)
+ {
+ case FUNCTION_CUBE:
+ case FUNCTION_ROLLUP:
+ is_grouping_sets = true;
+ break;
+ default:
+ is_grouping_sets = false;
+ }
+ }
+
+ result = lappend(result, expel);
+ }
+
+ /*
+ * when no grouping sets were detected but there is an empty set,
+ * build an implicit grouping set
+ */
+ if (!is_grouping_sets)
+ {
+ if (has_empty_set)
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = result;
+ gss->has_empty_set = true;
+ gss->location = location;
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) gss;
+ }
+ else
+ /* when no grouping sets machinery is needed, return the original list */
+ return (Node *) grouplist;
+ }
+
+ /* multiply all sets in the GROUP BY list */
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) set_multiplication(result, has_empty_set);
+ }
+
+ /*
+ * This function has two purposes: first, take the list from a row expression;
+ * second, remove all empty sets and all duplicate sets.
+ * ToDo: add a flag to RowExpr that carries info about explicit use of the
+ * ROW keyword.
+ */
+ static List **
+ adjustElements(List *elements, int *nfields, bool *has_empty_set)
+ {
+ ListCell *l;
+ List *felements = NIL; /* filtered elements, without empty sets */
+ int _nfields;
+ List **result;
+ int i = 0;
+
+ *has_empty_set = false;
+ foreach(l, elements)
+ {
+ if (IsEmptyGroupingSet((Node *) lfirst(l)))
+ {
+ *has_empty_set = true;
+ continue;
+ }
+
+ felements = list_append_unique(felements, lfirst(l));
+ }
+
+ if (IsNIL(felements))
+ return NULL;
+
+ _nfields = list_length(felements);
+ result = palloc(_nfields * sizeof(List *));
+
+ foreach(l, felements)
+ {
+ Node *el = (Node *) lfirst(l);
+
+ switch (nodeTag(el))
+ {
+ case T_A_Expr:
+ case T_ColumnRef:
+ result[i++] = list_make1(el);
+ break;
+ case T_RowExpr:
+ result[i++] = ((RowExpr *) el)->args;
+ break;
+ default:
+ elog(ERROR, "unexpected node type %d", nodeTag(el));
+ }
+ }
+
+ *nfields = _nfields;
+ return result;
+ }
+
+ /*
+ * This function adds one grouping set to a list of sets. The result is a
+ * list of distinct sets, i.e. a list of distinct lists.
+ *
+ * Note: the "distinct" in the function name is not related to the ALL|DISTINCT
+ * quantifier of the GROUP BY clause.
+ */
+ static List *
+ add_distinct_set(List *gsets_list, List *gsets)
+ {
+
+ List *uniqset = NIL; /* unique gsets */
+ ListCell *l;
+ int act_len;
+ bool found = false;
+
+ /*
+ * ensure that all fields in the new set are unique, so it can be
+ * compared against the (already unique) sets in gsets_list.
+ */
+ foreach (l, gsets)
+ uniqset = list_append_unique(uniqset, lfirst(l));
+
+ /* identical lists must have the same length */
+ act_len = list_length(uniqset);
+
+ foreach(l, gsets_list)
+ {
+ Node *node = (Node *) lfirst(l);
+
+ Assert(IsA(node, List));
+ if (list_length((List *) node) == act_len)
+ {
+ ListCell *lc;
+
+ found = true;
+ foreach(lc, (List *) node)
+ if (!list_member(uniqset, (Node *) lfirst(lc)))
+ {
+ found = false;
+ break;
+ }
+ if (found)
+ break;
+ }
+ }
+
+ if (!found)
+ return lappend(gsets_list, uniqset);
+ else
+ return gsets_list;
+ }
+
+ /*
+ * This is used for grouping sets(a, rollup(a,b))
+ */
+ static List *
+ add_distinct_sets(List *cum_gsets_list, List *gsets_list)
+ {
+ ListCell *l;
+
+ foreach(l, gsets_list)
+ {
+ Node *el = (Node *) lfirst(l);
+ Assert(IsA(el, List));
+ cum_gsets_list = add_distinct_set(cum_gsets_list, (List *) el);
+ }
+ return cum_gsets_list;
+ }
+
+ /*
+ * Evaluate CUBE and ROLLUP operators -
+ * transform from GroupingSetsFunc to GroupingSetsSpec.
+ *
+ * Note: This routine doesn't necessarily replace all GroupingSetsFunc nodes. It
+ * evaluates only the GROUPING SETS context. There can be such nodes elsewhere -
+ * mainly CUBE as a FuncCall - so we have to run a walker later to check for
+ * them (and possibly replace them).
+ */
+ static Node *
+ expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset)
+ {
+ bool has_empty_set = false;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ int location;
+
+ if (IsEmptyGroupingSet(node))
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ *isemptyset = false;
+
+ switch (nodeTag(node))
+ {
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ List **sets;
+ List *sets_union;
+ int nfields = list_length(gsf->expr_list);
+
+ if (gsf->identity == FUNCTION_CUBE && nfields > MAX_CUBE_FIELDS)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("more than eight fields in CUBE operator"),
+ parser_errposition(pstate, gsf->location)));
+
+ sets = adjustElements(gsf->expr_list, &nfields, &has_empty_set);
+ if (sets == NULL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ switch (gsf->identity)
+ {
+ case FUNCTION_CUBE:
+ {
+ /*
+ * Grouping Sets CUBE operator
+ * This operator expands the set {a,b} to the sets {{a,b},{a},{b},{}}
+ *
+ * All combinations are generated by counting a bitmap down,
+ * like
+ * 111 7 {a,b,c}
+ * 110 6 {a,b}
+ * 101 5 {a,c}
+ * 100 4 {a}
+ * 011 3 {b,c}
+ * 010 2 {b}
+ * 001 1 {c}
+ * 000 0 empty set
+ */
+ unsigned int btab[] = {0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff};
+ unsigned int bitmap = btab[nfields - 1];
+ unsigned int masc = bitmap;
+ unsigned int selector = 1 << (nfields - 1);
+ int i;
+
+ sets_union = NIL;
+ while (bitmap > 0)
+ {
+ unsigned int processed_bitmap = bitmap--;
+
+ sets_union = NIL;
+ for (i = 0; processed_bitmap > 0; i++)
+ {
+ if (processed_bitmap & selector)
+ sets_union = list_union(sets_union, sets[i]);
+
+ processed_bitmap = (processed_bitmap << 1) & masc;
+ }
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ case FUNCTION_ROLLUP:
+ {
+ /*
+ * Grouping Sets ROLLUP operator
+ * This operator expands the set {a,b} to the sets {{a,b},{a},{}}
+ */
+ int i;
+ int j;
+
+ set_list = NIL;
+ for (i = 0; i < nfields; i++)
+ {
+ sets_union = NIL;
+ for (j = 0; j < nfields - i; j++)
+ sets_union = list_union(sets_union, (List *) sets[j]);
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ default:
+ /*
+ * nothing to do here; for FUNCTION_GROUPING and
+ * FUNCTION_GROUPING_ID return the original node
+ */
+ pfree(sets);
+
+ return node;
+ }
+
+ pfree(sets);
+ location = gsf->location;
+ has_empty_set = true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ /*
+ * Grouping Sets (xxxx)
+ * a) evaluate CUBE and ROLLUP operators
+ * b) ensure distinct sets
+ * c) find empty sets inside
+ */
+ ListCell *gse;
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+
+ set_list = NIL;
+ location = gss->location;
+
+ foreach(gse, gss->set_list)
+ {
+ Node *grouping_element = (Node *) lfirst(gse);
+ Node *expanded_set = expandGroupingSetsOperator(pstate,
+ grouping_element, isemptyset);
+
+ if (*isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ switch (nodeTag(expanded_set))
+ {
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *egss = (GroupingSetsSpec *) expanded_set;
+
+ if (egss->has_empty_set)
+ has_empty_set = true;
+
+ if (egss->set_list != NIL)
+ set_list = add_distinct_sets(set_list, egss->set_list);
+ }
+ break;
+
+ case T_RowExpr:
+ {
+ /*
+ * In grouping sets(a,b,(c,b)) the parser recognizes (c,b) as row(c,b),
+ * so it is necessary to transform it to GroupingSetsSpec
+ */
+ ListCell *l;
+ RowExpr *rexpr = (RowExpr *) expanded_set;
+
+ foreach(l, rexpr->args)
+ if (IsEmptyGroupingSet((Node *)lfirst(l)))
+ {
+ has_empty_set = true;
+ break;
+ }
+
+ set_list = add_distinct_set(set_list, rexpr->args);
+ }
+ break;
+
+ default:
+ set_list = add_distinct_set(set_list, list_make1(expanded_set));
+ }
+ }
+ }
+ break;
+
+ default:
+ return node;
+ }
+
+ if (set_list == NIL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ result->has_empty_set = has_empty_set;
+ result->set_list = set_list;
+ result->location = location;
+ /* this grouping set isn't empty */
+ *isemptyset = false;
+
+ return (Node *) result;
+ }
+
+ /* multiply two nodes:
+ * GroupingSetsSpec x GroupingSetsSpec
+ * Node x Node
+ * Node x GroupingSetsSpec
+ * GroupingSetsSpec x Node
+ */
+ static Node *
+ multiple(Node *a, Node *b)
+ {
+ ListCell *l;
+ ListCell *lj;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ GroupingSetsSpec *gss_a;
+ GroupingSetsSpec *gss_b;
+
+ /* leave when nodes are equal */
+ if (equal(a, b))
+ return a;
+
+ if (!IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make2(a,b));
+ gss->has_empty_set = false;
+ gss->location = exprLocation(a);
+
+ return (Node *) gss;
+ }
+ if (IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ Node *aux = a;
+
+ a = b; b = aux;
+ }
+ if (!IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b))
+ {
+ gss_b = (GroupingSetsSpec *) b;
+
+ foreach(l, gss_b->set_list)
+ {
+ List *set = (List *) lfirst(l);
+
+ Assert(IsA(lfirst(l), List));
+
+ /* add item to all grouping sets */
+ set = list_append_unique(set, a);
+ set_list = add_distinct_set(set_list, set);
+ }
+
+ if (gss_b->has_empty_set)
+ {
+ set_list = add_distinct_set(set_list, (list_make1(a)));
+ gss_b->has_empty_set = false;
+ }
+
+ gss_b->set_list = set_list;
+
+ return (Node *) b;
+ }
+
+ /*
+ * ((A,B),(C)) * ((X,Y),()) = ((A,B,X,Y),(A,B),(C,X,Y),(C))
+ */
+ Assert(IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b));
+
+ gss_a = (GroupingSetsSpec *) a;
+ gss_b = (GroupingSetsSpec *) b;
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+
+ foreach(l, gss_a->set_list)
+ {
+ foreach(lj, gss_b->set_list)
+ set_list = add_distinct_set(set_list,
+ list_union((List *) lfirst(l),
+ (List *) lfirst(lj)));
+ if (gss_b->has_empty_set)
+ set_list = add_distinct_set(set_list,
+ (List *) lfirst(l));
+ }
+
+ if (gss_a->has_empty_set)
+ foreach(l, gss_b->set_list)
+ set_list = add_distinct_set(set_list, (List *) lfirst(l));
+
+ result->set_list = set_list;
+ result->has_empty_set = gss_a->has_empty_set && gss_b->has_empty_set;
+ result->location = gss_a->location;
+
+ return (Node *) result;
+ }
+
+ /*
+ * multiply elements of the list -> (a1,a2,a3,a4) => (((a1 x a2) x a3) x a4)
+ */
+ static Node *
+ set_multiplication(List *list, bool has_empty_set)
+ {
+ int nfields = list_length(list);
+ ListCell *lc;
+ bool first_iter = true;
+ Node *stack = NULL;
+
+ foreach(lc, list)
+ {
+ Node *el = (Node *) lfirst(lc);
+
+ /* Propagate info about empty set to first item */
+ if (first_iter)
+ {
+ if (has_empty_set)
+ {
+ if (IsGroupingSetsSpec(el))
+ {
+ ((GroupingSetsSpec *) el)->has_empty_set = true;
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return el;
+
+ stack = el;
+ }
+ else
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make1(el));
+ gss->has_empty_set = true;
+ gss->location = exprLocation(el);
+
+ stack = (Node *) gss;
+
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return stack;
+ }
+ }
+ else
+ stack = el;
+
+ first_iter = false;
+ /* go on to the next item */
+ continue;
+ }
+
+ Assert(stack != NULL);
+ stack = multiple(stack, el);
+ }
+
+ return stack;
+ }
+
+ /*
+ * This procedure prepares the global targetList, the local targetLists
+ * and the local groupClauses.
+ */
+ List *
+ transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause)
+ {
+ List *globalTL = copyObject((Node *)*targetList);
+ List *globalGC = NIL;
+ List *gpcs = NIL;
+ List *tls = NIL;
+ ListCell *gsc;
+ GroupingSetsSpec *gsspec = (GroupingSetsSpec *) linitial(groupClause);
+
+ *groupClauses = gpcs;
+ *targetLists = tls;
+ *targetList = globalTL;
+
+ return globalGC;
+ }
+
*** ./src/backend/parser/parse_target.c.orig 2009-05-04 21:16:40.000000000 +0200
--- ./src/backend/parser/parse_target.c 2009-05-04 21:22:30.000000000 +0200
***************
*** 1446,1451 ****
--- 1446,1467 ----
case T_XmlSerialize:
*name = "xmlserialize";
return 2;
+ case T_GroupingSetsFunc:
+ switch (((GroupingSetsFunc *) node)->identity)
+ {
+ case FUNCTION_GROUPING:
+ *name = "grouping";
+ return 2;
+ case FUNCTION_GROUPING_ID:
+ *name = "grouping_id";
+ return 2;
+ case FUNCTION_CUBE: /* keep compiler quiet */
+ case FUNCTION_ROLLUP:
+ /* nothing */
+ break;
+ }
+ break;
+
default:
break;
}
*** ./src/include/nodes/nodes.h.orig 2009-05-04 19:32:41.000000000 +0200
--- ./src/include/nodes/nodes.h 2009-05-04 20:46:20.000000000 +0200
***************
*** 376,381 ****
--- 376,383 ----
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_GroupingSetsFunc,
+ T_GroupingSetsSpec,
/*
* TAGS FOR RANDOM OTHER STUFF
*** ./src/include/nodes/parsenodes.h.orig 2009-05-04 19:15:28.000000000 +0200
--- ./src/include/nodes/parsenodes.h 2009-05-04 20:47:42.000000000 +0200
***************
*** 374,379 ****
--- 374,390 ----
} SortBy;
/*
+ * GroupingSetsSpec - for GROUP BY GROUPING SETS clause
+ */
+ typedef struct GroupingSetsSpec
+ {
+ NodeTag type;
+ List *set_list;
+ bool has_empty_set; /* true when grouping sets contains empty set */
+ int location;
+ } GroupingSetsSpec;
+
+ /*
* WindowDef - raw representation of WINDOW and OVER clauses
*
* For entries in a WINDOW list, "name" is the window name being defined.
***************
*** 416,421 ****
--- 427,452 ----
FRAMEOPTION_END_CURRENT_ROW)
/*
+ * GroupingSetsFunc - parser node for the GROUPING, GROUPING_ID, CUBE and ROLLUP quasi-functions
+ */
+ typedef enum GroupingSetsFuncIdentity
+ {
+ FUNCTION_GROUPING,
+ FUNCTION_GROUPING_ID,
+ FUNCTION_CUBE,
+ FUNCTION_ROLLUP
+ } GroupingSetsFuncIdentity;
+
+ typedef struct GroupingSetsFunc
+ {
+ NodeTag type;
+ GroupingSetsFuncIdentity identity;
+ Node *expr;
+ List *expr_list;
+ int location;
+ } GroupingSetsFunc;
+
+ /*
* RangeSubselect - subquery appearing in a FROM clause
*/
typedef struct RangeSubselect
*** ./src/include/parser/kwlist.h.orig 2009-05-04 18:08:32.000000000 +0200
--- ./src/include/parser/kwlist.h 2009-05-05 14:44:39.000000000 +0200
***************
*** 98,103 ****
--- 98,104 ----
PG_KEYWORD("createuser", CREATEUSER, UNRESERVED_KEYWORD)
PG_KEYWORD("cross", CROSS, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("csv", CSV, UNRESERVED_KEYWORD)
+ PG_KEYWORD("cube", CUBE, RESERVED_KEYWORD)
PG_KEYWORD("current", CURRENT_P, UNRESERVED_KEYWORD)
PG_KEYWORD("current_catalog", CURRENT_CATALOG, RESERVED_KEYWORD)
PG_KEYWORD("current_date", CURRENT_DATE, RESERVED_KEYWORD)
***************
*** 168,173 ****
--- 169,176 ----
PG_KEYWORD("granted", GRANTED, UNRESERVED_KEYWORD)
PG_KEYWORD("greatest", GREATEST, COL_NAME_KEYWORD)
PG_KEYWORD("group", GROUP_P, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping", GROUPING, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping_id", GROUPING_ID, COL_NAME_KEYWORD)
PG_KEYWORD("handler", HANDLER, UNRESERVED_KEYWORD)
PG_KEYWORD("having", HAVING, RESERVED_KEYWORD)
PG_KEYWORD("header", HEADER_P, UNRESERVED_KEYWORD)
***************
*** 317,322 ****
--- 320,326 ----
PG_KEYWORD("right", RIGHT, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("role", ROLE, UNRESERVED_KEYWORD)
PG_KEYWORD("rollback", ROLLBACK, UNRESERVED_KEYWORD)
+ PG_KEYWORD("rollup", ROLLUP, RESERVED_KEYWORD)
PG_KEYWORD("row", ROW, COL_NAME_KEYWORD)
PG_KEYWORD("rows", ROWS, UNRESERVED_KEYWORD)
PG_KEYWORD("rule", RULE, UNRESERVED_KEYWORD)
***************
*** 334,339 ****
--- 338,344 ----
PG_KEYWORD("session_user", SESSION_USER, RESERVED_KEYWORD)
PG_KEYWORD("set", SET, UNRESERVED_KEYWORD)
PG_KEYWORD("setof", SETOF, COL_NAME_KEYWORD)
+ PG_KEYWORD("sets", SETS, UNRESERVED_KEYWORD)
PG_KEYWORD("share", SHARE, UNRESERVED_KEYWORD)
PG_KEYWORD("show", SHOW, UNRESERVED_KEYWORD)
PG_KEYWORD("similar", SIMILAR, TYPE_FUNC_NAME_KEYWORD)
*** ./src/include/parser/parse_gsets.h.orig 2009-05-04 21:41:52.000000000 +0200
--- ./src/include/parser/parse_gsets.h 2009-05-11 11:22:57.000000000 +0200
***************
*** 0 ****
--- 1,14 ----
+ #ifndef PARSE_GSETS_H
+ #define PARSE_GSETS_H
+
+ #include "parser/parse_node.h"
+ #include "nodes/pg_list.h"
+
+
+ Node *expandGroupingSets(ParseState *pstate, List *grouplist);
+ List *transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause);
+
+
+ #endif /* PARSE_GSETS_H */
*** ./src/include/parser/parse_node.h.orig 2009-05-11 20:31:38.000000000 +0200
--- ./src/include/parser/parse_node.h 2009-05-11 20:32:05.000000000 +0200
***************
*** 101,106 ****
--- 101,107 ----
bool p_is_insert;
bool p_is_update;
Relation p_target_relation;
+ bool p_hasGroupingSets;
RangeTblEntry *p_target_rangetblentry;
} ParseState;
Hello all!
If no one objects (in other words, if all agree), I will continue work on
the patch - in particular, I want to support a second strategy (a tuple
store instead of a hash table) to preserve the order of the source (a
cheaper solution in the case of grouping sets + ORDER BY), investigate and
brainstorm other optimisations, and write regression tests and technical
documentation.
But I need some time to complete my investigation of PostgreSQL internals,
particularly CTEs.
Thanks.
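The order-preserving strategy mentioned here relies on the fact that, once the input is sorted on the grouping columns, ROLLUP subtotals can be computed in a single pass with no hash table at all. A minimal sketch of that idea for ROLLUP(a,b) over SUM(items) - everything below is hypothetical and only illustrates the technique, it is not code from the patch:

```c
#include <assert.h>

/* One-pass ROLLUP(a,b) over input pre-sorted on (a,b).  Running
 * subtotals are kept per grouping level and a subtotal is emitted
 * whenever the corresponding key prefix changes, so the output
 * follows the input order and no hash table is needed. */
typedef struct { int a, b, items; } Row;

static int nout;            /* number of emitted subtotals */
static int out_sum[16];     /* emitted subtotal values, in order */

static void emit(int sum) { out_sum[nout++] = sum; }

static void rollup_sorted(const Row *rows, int n)
{
    int sum_ab = 0, sum_a = 0, sum_all = 0;
    int i;

    for (i = 0; i < n; i++)
    {
        /* (a,b) prefix changed: close the (a,b) group */
        if (i > 0 && (rows[i].a != rows[i-1].a || rows[i].b != rows[i-1].b))
        {
            emit(sum_ab);
            sum_ab = 0;
        }
        /* (a) prefix changed: close the (a) group */
        if (i > 0 && rows[i].a != rows[i-1].a)
        {
            emit(sum_a);
            sum_a = 0;
        }
        sum_ab += rows[i].items;
        sum_a += rows[i].items;
        sum_all += rows[i].items;
    }
    if (n > 0)
    {
        emit(sum_ab);       /* last (a,b) group */
        emit(sum_a);        /* last (a) group */
    }
    emit(sum_all);          /* grand total: the () grouping set */
}
```

For sorted input {(1,1,5),(1,2,3),(2,1,4)} this emits the (a,b) subtotals 5, 3 and 4, the (a) subtotals 8 and 4, and the grand total 12, interleaved in input order.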
2009/5/13 Pavel Stehule <pavel.stehule@gmail.com>:
Hello Oleg
I am sending a new CTE-based variant of my GROUPING SETS patch.
This patch has some bugs, but it is a good prototype (it's more stable
than the old patch):

postgres=# select selling_date, baguette, sum(items) from
baguette_selling group by grouping sets(1,2);
selling_date | baguette | sum
--------------+----------+-----
2007-10-30 | | 17
2007-10-31 | | 12
| golf | 9
| buster | 20
(4 rows)

postgres=# select selling_date, baguette, sum(items),
grouping(selling_date), grouping(baguette), grouping_id(selling_date,
baguette) from baguette_selling group by grouping sets(1,2);
selling_date | baguette | sum | grouping | grouping | grouping_id
--------------+----------+-----+----------+----------+-------------
2007-10-30 | | 17 | 1 | 0 | 2
2007-10-31 | | 12 | 1 | 0 | 2
| golf | 9 | 0 | 1 | 1
| buster | 20 | 0 | 1 | 1
(4 rows)

postgres=# select selling_date, baguette, sum(items),
grouping(selling_date), grouping(baguette), grouping_id(selling_date,
baguette) from baguette_selling group by grouping sets(1,2,());
selling_date | baguette | sum | grouping | grouping | grouping_id
--------------+----------+-----+----------+----------+-------------
2007-10-30 | | 17 | 1 | 0 | 2
2007-10-31 | | 12 | 1 | 0 | 2
| golf | 9 | 0 | 1 | 1
| buster | 20 | 0 | 1 | 1
| | 29 | 0 | 0 | 0
(5 rows)

I think the parser part is solid and correct (and has been ported to 8.4).
CTE works well, but not 100% efficiently, and it will be better to use
the tuplestore interface directly (as a second technique - for when hash
tables can't be used).

I am thinking that the best solution is enhancing the current Aggregate
node to support GroupingSets. A code base built on UNION ALL is roughly
equal to the CTE approach, and I don't think it would be optimal. But work
freely, please - I have no free time for this patch for the next two
months. So if you have time, it's yours.

regards
Pavel Stehule
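The GROUPING_ID values in the output above are simply the per-column GROUPING flags packed into a bit mask, with the first argument in the most significant bit: flags (1,0) give 2 and (0,1) give 1, matching the tables. A hedged sketch of that packing (hypothetical helper, not the patch's actual code):

```c
#include <assert.h>

/* Pack per-column GROUPING flags into a GROUPING_ID bit mask.
 * The first argument lands in the most significant bit, so for
 * grouping_id(selling_date, baguette) the flags (1,0) yield 2. */
static int grouping_id(const int *grouping_flags, int nargs)
{
    int id = 0;
    int i;

    for (i = 0; i < nargs; i++)
        id = (id << 1) | (grouping_flags[i] & 1);  /* shift earlier bits up */
    return id;
}
```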
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, May 12, 2009 at 11:20:14PM +0200, Pavel Stehule wrote:
this patch has some bugs but it is good prototype (it's more stable
than old patch):
I'm not sure if you're at the point that you're interested in bug reports, but
here's something that didn't behave as expected:
5432 josh@josh*# create table gsettest (prod_id integer, cust_id integer,
quantity integer);
CREATE TABLE
5432 josh@josh*# insert into gsettest select floor(random() * 10)::int,
floor(random() * 20)::int, floor(random() * 10)::int from generate_series(1,
100);
INSERT 0 100
5432 josh@josh*# select prod_id, cust_id, sum(quantity) from gsettest group by
cube (prod_id, cust_id) order by 1, 2;
prod_id | cust_id | sum
---------+---------+-----
5 | 7 | 4
8 | 16 | 3
9 | 19 | 8
4 | 13 | 3
8 | 8 | 15
5 | 2 | 4
7 | 6 | 7
6 | 6 | 3
</snip>
Note that the results aren't sorted. The following, though, works around it:
5432 josh@josh*# select * from (select prod_id, cust_id, sum(quantity) from
gsettest group by cube (prod_id, cust_id)) f order by 1, 2;
prod_id | cust_id | sum
---------+---------+-----
0 | 2 | 8
0 | 4 | 8
0 | 5 | 2
0 | 7 | 11
0 | 8 | 7
0 | 9 | 1
0 | 12 | 3
0 | 14 | 7
0 | 16 | 5
0 | 17 | 8
0 | 18 | 9
0 | 19 | 2
0 | | 71
</snip>
EXPLAIN output is as follows:
5432 josh@josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
group by cube (prod_id, cust_id) order by 1, 2;
QUERY PLAN
---------------------------------------------------------------------------
Append (cost=193.54..347.71 rows=601 width=9)
CTE **g**
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: gsettest.prod_id, gsettest.cust_id
-> Seq Scan on gsettest (cost=0.00..29.40 rows=1940 width=12)
-> HashAggregate (cost=53.35..55.85 rows=200 width=12)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=12)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> Aggregate (cost=43.65..43.66 rows=1 width=4)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=4)
(13 rows)
...and without the ORDER BY clause just to prove that it really is the reason
for the Sort step...
5432 josh@josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
group by cube (prod_id, cust_id);
QUERY PLAN
------------------------------------------------------------------------
Append (cost=82.75..236.92 rows=601 width=9)
CTE **g**
-> Seq Scan on gsettest (cost=0.00..29.40 rows=1940 width=12)
-> HashAggregate (cost=53.35..55.85 rows=200 width=12)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=12)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> Aggregate (cost=43.65..43.66 rows=1 width=4)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=4)
(11 rows)
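The four aggregate branches in the plan above (three HashAggregates plus one plain Aggregate) are one per grouping set: cube (prod_id, cust_id) expands to 2^2 sets. The patch enumerates them by counting a bitmap down from all-ones; a standalone sketch of that enumeration (hypothetical function name, not the patch's code):

```c
#include <assert.h>

/* Enumerate the grouping sets produced by CUBE over nfields fields:
 * count a bitmap of nfields bits down from all-ones.  Each value
 * names one grouping set (bit i set = field i included), and 0 is
 * the empty set, i.e. the grand total. */
static int cube_sets(int nfields, unsigned int *sets_out)
{
    unsigned int bitmap = (1u << nfields) - 1;  /* all fields present */
    int n = 0;

    while (bitmap > 0)
        sets_out[n++] = bitmap--;
    sets_out[n++] = 0;                          /* the empty set */
    return n;
}
```

For cube (prod_id, cust_id), nfields is 2, so this yields the bitmaps 3, 2, 1, 0 - exactly the four aggregate branches in the plan.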
I'm hoping I'll get a chance to poke at the patch some. This could be very
useful...
- Josh / eggyknap
2009/5/13 Joshua Tolley <eggyknap@gmail.com>:
Note that the results aren't sorted.
I think the result should not be sorted - it's the same as a normal GROUP BY.
regards
Pavel Stehule
On Wed, May 13, 2009 at 06:29:41AM +0200, Pavel Stehule wrote:
I think the result should not be sorted - it's the same as a normal GROUP BY.
Normal GROUP BY wouldn't have ignored the ORDER BY clause I included.
- Josh
2009/5/13 Joshua Tolley <eggyknap@gmail.com>:
I think the result should not be sorted - it's the same as a normal GROUP BY.
Normal GROUP BY wouldn't have ignored the ORDER BY clause I included.
Sorry, now I understand - it is simply a bug. I have fixed it.
Thank You
Pavel
Robert Haas <robertmhaas@gmail.com> writes:
But that leads me to a question - does the existing HashAggregate code
make any attempt to obey work_mem?
No.
regards, tom lane
On Wed, May 13, 2009 at 03:12:51PM +0200, Pavel Stehule wrote:
Sorry, now I understand - it is simply a bug. I have fixed it.
Where's the new patch?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Here it is.
I checked the results against Oracle, and the basic results are the same,
with one exception: this patch doesn't handle expression-specified sets
well.
This result is correct:
postgres=# select selling_date, baguette, canteen, sum(items),
grouping(baguette), grouping(selling_date),
grouping_id(baguette,selling_date) from baguette_selling group by
grouping sets(baguette, selling_date, canteen,());
selling_date | baguette | canteen | sum | grouping | grouping | grouping_id
--------------+----------+---------+-----+----------+----------+-------------
| golf | | 9 | 0 | 1 | 1
| buster | | 20 | 0 | 1 | 1
2007-10-30 | | | 17 | 1 | 0 | 2
2007-10-31 | | | 12 | 1 | 0 | 2
| | Prague | 14 | 1 | 1 | 3
| | Berlin | 15 | 1 | 1 | 3
| | | 29 | 1 | 1 | 3
(7 rows)
but this result is not:
postgres=# select extract(day from selling_date), selling_date,
baguette, canteen, sum(items), grouping(baguette),
grouping(selling_date), grouping_id(baguette,selling_date) from
baguette_selling group by grouping sets(baguette, selling_date,
canteen, extract(day from selling_date))
;
 date_part | selling_date | baguette | canteen | sum | grouping | grouping | grouping_id
-----------+--------------+----------+---------+-----+----------+----------+-------------
           |              | golf     |         |   9 |        0 |        1 |           1
           |              | buster   |         |  20 |        0 |        1 |           1
        30 | 2007-10-30   |          |         |  17 |        1 |        0 |           2
        31 | 2007-10-31   |          |         |  12 |        1 |        0 |           2
           |              |          | Prague  |  14 |        1 |        1 |           3
           |              |          | Berlin  |  15 |        1 |        1 |           3
           |              |          |         |  29 |        1 |        1 |           3
(7 rows)
The date_part column is problematic.
regards
Pavel Stehule
2009/5/13 David Fetter <david@fetter.org>:
Show quoted text
On Wed, May 13, 2009 at 03:12:51PM +0200, Pavel Stehule wrote:
2009/5/13 Joshua Tolley <eggyknap@gmail.com>:
On Wed, May 13, 2009 at 06:29:41AM +0200, Pavel Stehule wrote:
2009/5/13 Joshua Tolley <eggyknap@gmail.com>:
On Tue, May 12, 2009 at 11:20:14PM +0200, Pavel Stehule wrote:
this patch has some bugs but it is good prototype (it's more stable
than old patch):I'm not sure if you're at the point that you're interested in bug reports, but
here's something that didn't behave as expected:5432 josh@josh*# create table gsettest (prod_id integer, cust_id integer,
quantity integer);
CREATE TABLE
5432 josh@josh*# insert into gsettest select floor(random() * 10)::int,
floor(random() * 20)::int, floor(random() * 10)::int from generate_series(1,
100);
INSERT 0 100
5432 josh@josh*# select prod_id, cust_id, sum(quantity) from gsettest group by
cube (prod_id, cust_id) order by 1, 2;
prod_id | cust_id | sum
---------+---------+-----
5 | 7 | 4
8 | 16 | 3
9 | 19 | 8
4 | 13 | 3
8 | 8 | 15
5 | 2 | 4
7 | 6 | 7
6 | 6 | 3
</snip>
Note that the results aren't sorted. The following, though, works around it:
I think the result should not be sorted - it's the same as a normal GROUP BY.
Normal GROUP BY wouldn't have ignored the ORDER BY clause I included.
sorry, now I understand - simply it is a bug. I fixed it
Where's the new patch?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachments:
gsets-0.5.diff (text/x-patch; charset=US-ASCII)
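For orientation before diving into the diff: the patch expands CUBE and ROLLUP into explicit grouping sets and then rewrites the query as a CTE plus UNION ALL. The ROLLUP part of that expansion amounts to taking the prefixes of the column list (illustrative only; the bitmask representation and helper name below are not from the patch):

```c
/* Expand ROLLUP over ncols columns into ncols+1 grouping sets,
 * represented as bitmasks where bit i set means column i is grouped.
 * ROLLUP(a,b,c) -> {a,b,c}, {a,b}, {a}, {} (the grand total). */
static int
expand_rollup(int ncols, unsigned sets[], int maxsets)
{
    int n = 0;

    for (int k = ncols; k >= 0 && n < maxsets; k--)
        sets[n++] = (1u << k) - 1;   /* columns 0..k-1 grouped */
    return n;
}
```

Each resulting set then becomes one branch of the UNION ALL over the CTE.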
*** ./src/backend/nodes/copyfuncs.c.orig 2009-05-04 19:54:52.000000000 +0200
--- ./src/backend/nodes/copyfuncs.c 2009-05-13 18:27:38.000000000 +0200
***************
*** 1056,1061 ****
--- 1056,1118 ----
}
/*
+ * _copyGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _copyGroupingSetsFunc(GroupingSetsFunc *from)
+ {
+ GroupingSetsFunc *newnode = makeNode(GroupingSetsFunc);
+
+ COPY_SCALAR_FIELD(identity);
+ COPY_NODE_FIELD(expr);
+ COPY_NODE_FIELD(expr_list);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
+ * _copyGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _copyGroupingSetsSpec(GroupingSetsSpec *from)
+ {
+ GroupingSetsSpec *newnode = makeNode(GroupingSetsSpec);
+
+ COPY_NODE_FIELD(set_list);
+ COPY_SCALAR_FIELD(has_empty_set);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
+ * _copyEmptySet
+ */
+ static EmptySet *
+ _copyEmptySet(EmptySet *from)
+ {
+ EmptySet *newnode = makeNode(EmptySet);
+
+ return newnode;
+ }
+
+ /*
+ * _copyGroupByClause
+ */
+ static GroupByClause *
+ _copyGroupByClause(GroupByClause *from)
+ {
+ GroupByClause *newnode = makeNode(GroupByClause);
+
+ COPY_SCALAR_FIELD(all);
+ COPY_NODE_FIELD(fields);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
* _copyScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 4093,4098 ****
--- 4150,4167 ----
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
+ case T_GroupingSetsFunc:
+ retval = _copyGroupingSetsFunc(from);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _copyGroupingSetsSpec(from);
+ break;
+ case T_EmptySet:
+ retval = _copyEmptySet(from);
+ break;
+ case T_GroupByClause:
+ retval = _copyGroupByClause(from);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** ./src/backend/nodes/equalfuncs.c.orig 2009-05-04 20:01:09.000000000 +0200
--- ./src/backend/nodes/equalfuncs.c 2009-05-13 17:42:23.000000000 +0200
***************
*** 290,295 ****
--- 290,332 ----
}
static bool
+ _equalGroupingSetsFunc(GroupingSetsFunc *a, GroupingSetsFunc *b)
+ {
+ COMPARE_SCALAR_FIELD(identity);
+ COMPARE_NODE_FIELD(expr);
+ COMPARE_NODE_FIELD(expr_list);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
+ _equalGroupingSetsSpec(GroupingSetsSpec *a, GroupingSetsSpec *b)
+ {
+ COMPARE_NODE_FIELD(set_list);
+ COMPARE_SCALAR_FIELD(has_empty_set);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
+ _equalEmptySet(EmptySet *a, EmptySet *b)
+ {
+ return true;
+ }
+
+ static bool
+ _equalGroupByClause(GroupByClause *a, GroupByClause *b)
+ {
+ COMPARE_SCALAR_FIELD(all);
+ COMPARE_NODE_FIELD(fields);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
_equalScalarArrayOpExpr(ScalarArrayOpExpr *a, ScalarArrayOpExpr *b)
{
COMPARE_SCALAR_FIELD(opno);
***************
*** 2871,2876 ****
--- 2908,2925 ----
case T_XmlSerialize:
retval = _equalXmlSerialize(a, b);
break;
+ case T_GroupingSetsFunc:
+ retval = _equalGroupingSetsFunc(a, b);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _equalGroupingSetsSpec(a, b);
+ break;
+ case T_EmptySet:
+ retval = _equalEmptySet(a, b);
+ break;
+ case T_GroupByClause:
+ retval = _equalGroupByClause(a, b);
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
*** ./src/backend/nodes/nodeFuncs.c.orig 2009-05-04 21:56:15.000000000 +0200
--- ./src/backend/nodes/nodeFuncs.c 2009-05-05 14:05:41.000000000 +0200
***************
*** 917,922 ****
--- 917,928 ----
/* just use argument's location */
loc = exprLocation((Node *) ((PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupingSetsFunc:
+ loc = ((GroupingSetsFunc *) expr)->location;
+ break;
+ case T_GroupingSetsSpec:
+ loc = ((GroupingSetsSpec *) expr)->location;
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
***************
*** 1342,1347 ****
--- 1348,1368 ----
break;
case T_PlaceHolderInfo:
return walker(((PlaceHolderInfo *) node)->ph_var, context);
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+
+ if (expression_tree_walker(gsf->expr, walker, context))
+ return true;
+ if (expression_tree_walker((Node *) gsf->expr_list, walker, context))
+ return true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ if (expression_tree_walker((Node *)((GroupingSetsSpec *) node)->set_list,
+ walker, context))
+ return true;
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
***************
*** 2016,2021 ****
--- 2037,2063 ----
return (Node *) newnode;
}
break;
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ GroupingSetsFunc *newnode;
+
+ FLATCOPY(newnode, gsf, GroupingSetsFunc);
+ MUTATE(newnode->expr, gsf->expr, Node *);
+ MUTATE(newnode->expr_list, gsf->expr_list, List *);
+ return (Node *) newnode;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+ GroupingSetsSpec *newnode;
+
+ FLATCOPY(newnode, gss, GroupingSetsSpec);
+ MUTATE(newnode->set_list, gss->set_list, List *);
+ return (Node *) newnode;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
*** ./src/backend/nodes/outfuncs.c.orig 2009-05-04 20:04:05.000000000 +0200
--- ./src/backend/nodes/outfuncs.c 2009-05-13 17:42:33.000000000 +0200
***************
*** 902,907 ****
--- 902,943 ----
}
static void
+ _outGroupingSetsFunc(StringInfo str, GroupingSetsFunc *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSFUNC");
+
+ WRITE_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ WRITE_NODE_FIELD(expr);
+ WRITE_NODE_FIELD(expr_list);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
+ _outGroupingSetsSpec(StringInfo str, GroupingSetsSpec *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSSPEC");
+
+ WRITE_NODE_FIELD(set_list);
+ WRITE_BOOL_FIELD(has_empty_set);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
+ _outEmptySet(StringInfo str, EmptySet *node)
+ {
+ WRITE_NODE_TYPE("EMPTYSET");
+ }
+
+ static void
+ _outGroupByClause(StringInfo str, GroupByClause *node)
+ {
+ WRITE_NODE_TYPE("GROUPBYCLAUSE");
+ WRITE_BOOL_FIELD(all);
+ WRITE_NODE_FIELD(fields);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
_outScalarArrayOpExpr(StringInfo str, ScalarArrayOpExpr *node)
{
WRITE_NODE_TYPE("SCALARARRAYOPEXPR");
***************
*** 2774,2779 ****
--- 2810,2827 ----
case T_XmlSerialize:
_outXmlSerialize(str, obj);
break;
+ case T_GroupingSetsFunc:
+ _outGroupingSetsFunc(str, obj);
+ break;
+ case T_GroupingSetsSpec:
+ _outGroupingSetsSpec(str, obj);
+ break;
+ case T_EmptySet:
+ _outEmptySet(str, obj);
+ break;
+ case T_GroupByClause:
+ _outGroupByClause(str, obj);
+ break;
default:
*** ./src/backend/nodes/readfuncs.c.orig 2009-05-04 20:09:32.000000000 +0200
--- ./src/backend/nodes/readfuncs.c 2009-05-13 17:42:42.000000000 +0200
***************
*** 583,588 ****
--- 583,645 ----
}
/*
+ * _readGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _readGroupingSetsFunc(void)
+ {
+ READ_LOCALS(GroupingSetsFunc);
+
+ READ_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ READ_NODE_FIELD(expr);
+ READ_NODE_FIELD(expr_list);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _readGroupingSetsSpec(void)
+ {
+ READ_LOCALS(GroupingSetsSpec);
+
+ READ_NODE_FIELD(set_list);
+ READ_BOOL_FIELD(has_empty_set);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readEmptySet
+ */
+ static EmptySet *
+ _readEmptySet(void)
+ {
+ READ_LOCALS_NO_FIELDS(EmptySet);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readGroupByClause
+ */
+ static GroupByClause *
+ _readGroupByClause(void)
+ {
+ READ_LOCALS(GroupByClause);
+
+ READ_BOOL_FIELD(all);
+ READ_NODE_FIELD(fields);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
* _readScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 1274,1279 ****
--- 1331,1344 ----
return_value = _readNotifyStmt();
else if (MATCH("DECLARECURSOR", 13))
return_value = _readDeclareCursorStmt();
+ else if (MATCH("GROUPINGSETSFUNC", 16))
+ return_value = _readGroupingSetsFunc();
+ else if (MATCH("GROUPINGSETSSPEC", 16))
+ return_value = _readGroupingSetsSpec();
+ else if (MATCH("EMPTYSET", 8))
+ return_value = _readEmptySet();
+ else if (MATCH("GROUPBYCLAUSE", 13))
+ return_value = _readGroupByClause();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
*** ./src/backend/parser/analyze.c.orig 2009-05-05 14:14:42.000000000 +0200
--- ./src/backend/parser/analyze.c 2009-05-13 20:53:42.000000000 +0200
***************
*** 34,39 ****
--- 34,40 ----
#include "parser/parse_clause.h"
#include "parser/parse_coerce.h"
#include "parser/parse_cte.h"
+ #include "parser/parse_gsets.h"
#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "parser/parse_target.h"
***************
*** 148,153 ****
--- 149,310 ----
}
/*
+ * process GROUPING SETS
+ */
+ static SelectStmt *
+ makeSelectStmt(List *targetList, List *fromClause)
+ {
+ SelectStmt *n = makeNode(SelectStmt);
+ n->distinctClause = NULL;
+ n->intoClause = NULL;
+ n->targetList = targetList;
+ n->fromClause = fromClause;
+ n->whereClause = NULL;
+ n->groupClause = NULL;
+ n->havingClause = NULL;
+ n->windowClause = NIL;
+ n->withClause = NULL;
+ n->valuesLists = NIL;
+ n->sortClause = NIL;
+ n->limitOffset = NULL;
+ n->limitCount = NULL;
+ n->lockingClause = NIL;
+ n->op = SETOP_NONE;
+ n->all = false;
+ n->larg = NULL;
+ n->rarg = NULL;
+ return n;
+ }
+
+ static List *
+ makeStarTargetList(void)
+ {
+ ResTarget *rt = makeNode(ResTarget);
+
+ rt->name = NULL;
+ rt->indirection = NIL;
+ rt->val = (Node *) makeNode(ColumnRef);
+ ((ColumnRef *) rt->val)->fields = list_make1(makeNode(A_Star));
+ rt->location = -1;
+
+ return list_make1(rt);
+ }
+
+ static SelectStmt *
+ transformGroupingSets(ParseState *pstate, SelectStmt *stmt)
+ {
+ if (stmt->groupClause && IsA(stmt->groupClause, GroupByClause))
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) expandGroupingSets(pstate,
+ (List *)((GroupByClause *)stmt->groupClause)->fields);
+
+ if (pstate->p_hasGroupingSets)
+ {
+ CommonTableExpr *cte = makeNode(CommonTableExpr);
+ SelectStmt *cteedstmt;
+ int ngroupingsets = list_length(gss->set_list) + (gss->has_empty_set ? 1 : 0);
+ bool all = ((GroupByClause *) stmt->groupClause)->all;
+
+ cteedstmt = makeSelectStmt(NIL, NIL);
+ cteedstmt->intoClause = stmt->intoClause;
+ cteedstmt->sortClause = stmt->sortClause;
+ cteedstmt->limitOffset = stmt->limitOffset;
+ cteedstmt->limitCount = stmt->limitCount;
+ cteedstmt->lockingClause = stmt->lockingClause;
+
+ cte->ctename = "**g**";
+ cte->ctequery = (Node *) stmt;
+ cte->location = -1;
+
+ cteedstmt->withClause = makeNode(WithClause);
+ cteedstmt->withClause->ctes = list_make1(cte);
+ cteedstmt->withClause->recursive = false;
+ cteedstmt->withClause->location = -1;
+
+ /* when there is more than one grouping set, we should generate a setop node */
+ if (ngroupingsets > 1)
+ {
+ /* add a query under UNION ALL for every grouping set */
+ SelectStmt *larg = NULL;
+ SelectStmt *rarg;
+ ListCell *lc;
+
+ foreach(lc, gss->set_list)
+ {
+ List *groupClause;
+
+ Assert(IsA(lfirst(lc), List));
+ groupClause = (List *) lfirst(lc);
+
+ if (larg == NULL)
+ {
+ larg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ larg->groupClause = (Node *) groupClause;
+ larg->havingClause = copyObject(stmt->havingClause);
+ }
+ else
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->groupClause = (Node *) groupClause;
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = all;
+
+ larg = setop;
+ }
+ }
+ if (gss->has_empty_set)
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = all;
+
+ larg = setop;
+ }
+ /* merge larg to result */
+ cteedstmt->op = larg->op;
+ cteedstmt->larg = larg->larg;
+ cteedstmt->rarg = larg->rarg;
+ cteedstmt->all = larg->all;
+ }
+ else
+ {
+ /* there isn't used setop node */
+ cteedstmt->targetList = copyObject(stmt->targetList);
+ cteedstmt->fromClause = list_make1(makeRangeVar(NULL, "**g**", -1));
+ }
+
+ ((SelectStmt *)cte->ctequery)->targetList = makeStarTargetList();
+ ((SelectStmt *)cte->ctequery)->groupClause = NULL;
+ ((SelectStmt *)cte->ctequery)->sortClause = NIL;
+ ((SelectStmt *)cte->ctequery)->limitOffset = stmt->limitOffset;
+ ((SelectStmt *)cte->ctequery)->limitCount = stmt->limitCount;
+ ((SelectStmt *)cte->ctequery)->lockingClause = stmt->lockingClause;
+
+ return cteedstmt;
+ }
+ else
+ /* trim GroupByClause to groupByClause */
+ stmt->groupClause = (Node *)((GroupByClause *)stmt->groupClause)->fields;
+ }
+ return stmt;
+ }
+
+ /*
* transformStmt -
* transform a Parse tree into a Query tree.
*/
***************
*** 176,181 ****
--- 333,340 ----
case T_SelectStmt:
{
SelectStmt *n = (SelectStmt *) parseTree;
+
+ n = transformGroupingSets(pstate, n);
if (n->valuesLists)
result = transformValuesClause(pstate, n);
***************
*** 826,835 ****
true /* fix unknowns */ );
qry->groupClause = transformGroupClause(pstate,
! stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
--- 985,994 ----
true /* fix unknowns */ );
qry->groupClause = transformGroupClause(pstate,
! (List *) stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
***************
*** 921,927 ****
Assert(stmt->targetList == NIL);
Assert(stmt->fromClause == NIL);
Assert(stmt->whereClause == NULL);
! Assert(stmt->groupClause == NIL);
Assert(stmt->havingClause == NULL);
Assert(stmt->windowClause == NIL);
Assert(stmt->op == SETOP_NONE);
--- 1080,1086 ----
Assert(stmt->targetList == NIL);
Assert(stmt->fromClause == NIL);
Assert(stmt->whereClause == NULL);
! Assert(stmt->groupClause == NULL);
Assert(stmt->havingClause == NULL);
Assert(stmt->windowClause == NIL);
Assert(stmt->op == SETOP_NONE);
*** ./src/backend/parser/gram.y.orig 2009-05-04 18:07:44.000000000 +0200
--- ./src/backend/parser/gram.y 2009-05-13 17:43:23.000000000 +0200
***************
*** 118,123 ****
--- 118,125 ----
static Node *makeNullAConst(int location);
static Node *makeAConst(Value *v, int location);
static Node *makeBoolAConst(bool state, int location);
+ static Node *makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr,
+ List *expr_list, int location);
static FuncCall *makeOverlaps(List *largs, List *rargs, int location);
static void check_qualified_name(List *names);
static List *check_func_name(List *names);
***************
*** 284,290 ****
target_list insert_column_list set_target_list
set_clause_list set_clause multiple_set_clause
ctext_expr_list ctext_row def_list indirection opt_indirection
! reloption_list group_clause TriggerFuncArgs select_limit
opt_select_limit opclass_item_list opclass_drop_list
opt_opfamily transaction_mode_list_or_empty
TableFuncElementList opt_type_modifiers
--- 286,292 ----
target_list insert_column_list set_target_list
set_clause_list set_clause multiple_set_clause
ctext_expr_list ctext_row def_list indirection opt_indirection
! reloption_list TriggerFuncArgs select_limit
opt_select_limit opclass_item_list opclass_drop_list
opt_opfamily transaction_mode_list_or_empty
TableFuncElementList opt_type_modifiers
***************
*** 416,421 ****
--- 418,427 ----
%type <str> opt_existing_window_name
%type <ival> opt_frame_clause frame_extent frame_bound
+ %type <node> grouping_element empty_grouping_set grouping_sets_spec
+ group_clause
+ %type <list> grouping_element_list
+ %type <boolean> opt_grouping_set_quantifier
/*
* If you make any token changes, update the keyword table in
***************
*** 438,444 ****
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
--- 444,450 ----
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CUBE CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
***************
*** 451,457 ****
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P
HANDLER HAVING HEADER_P HOLD HOUR_P
--- 457,463 ----
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P GROUPING GROUPING_ID
HANDLER HAVING HEADER_P HOLD HOUR_P
***************
*** 485,494 ****
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
--- 491,500 ----
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SETS SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
***************
*** 7227,7234 ****
;
group_clause:
! GROUP_P BY expr_list { $$ = $3; }
! | /*EMPTY*/ { $$ = NIL; }
;
having_clause:
--- 7233,7286 ----
;
group_clause:
! GROUP_P BY opt_grouping_set_quantifier grouping_element_list
! {
! GroupByClause *clause = makeNode(GroupByClause);
! clause->all = $3;
! clause->fields = $4;
! clause->location = @1;
! $$ = (Node *) clause;
! }
! | /*EMPTY*/
! {
! $$ = NULL;
! }
! ;
!
! grouping_sets_spec:
! GROUPING SETS '(' grouping_element_list ')'
! {
! /* We cannot identify and drop empty sets yet. */
! GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
! gss->set_list = $4;
! gss->has_empty_set = false;
! gss->location = @1;
! $$ = (Node *) gss;
! }
! ;
!
! grouping_element_list:
! grouping_element { $$ = list_make1($1); }
! | grouping_element_list ',' grouping_element { $$ = lappend($1, $3); }
! ;
!
! grouping_element:
! a_expr { $$ = $1; }
! | grouping_sets_spec { $$ = $1; }
! | empty_grouping_set { $$ = $1; }
! ;
!
! empty_grouping_set:
! '(' ')'
! {
! $$ = makeNode(EmptySet);
! }
! ;
!
! opt_grouping_set_quantifier:
! ALL { $$ = true; }
! | DISTINCT { $$ = false; }
! | /*EMPTY*/ { $$ = true; }
;
having_clause:
***************
*** 9266,9271 ****
--- 9318,9346 ----
n->location = @1;
$$ = (Node *)n;
}
+ | GROUPING '(' a_expr ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING, $3, NIL, @1);
+ }
+ | GROUPING_ID '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING_ID, NULL, $3, @1);
+ }
+ | CUBE '(' expr_list ')'
+ {
+ /*
+ * CUBE can appear in two different contexts: first, as
+ * part of a GROUPING SETS specification; second, as a
+ * normal UDF from the contrib cube module. When it is not
+ * in a grouping sets context, the node is transformed to
+ * a FuncCall node later.
+ */
+ $$ = makeGroupingSetsFunc(FUNCTION_CUBE, NULL, $3, @1);
+ }
+ | ROLLUP '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_ROLLUP, NULL, $3, @1);
+ }
;
/*
***************
*** 10320,10325 ****
--- 10395,10401 ----
| SERVER
| SESSION
| SET
+ | SETS
| SHARE
| SHOW
| SIMPLE
***************
*** 10395,10400 ****
--- 10471,10477 ----
| EXTRACT
| FLOAT_P
| GREATEST
+ | GROUPING_ID
| INOUT
| INT_P
| INTEGER
***************
*** 10489,10494 ****
--- 10566,10572 ----
| COLUMN
| CONSTRAINT
| CREATE
+ | CUBE
| CURRENT_CATALOG
| CURRENT_DATE
| CURRENT_ROLE
***************
*** 10510,10515 ****
--- 10588,10594 ----
| FROM
| GRANT
| GROUP_P
+ | GROUPING
| HAVING
| IN_P
| INITIALLY
***************
*** 10533,10538 ****
--- 10612,10618 ----
| PRIMARY
| REFERENCES
| RETURNING
+ | ROLLUP
| SELECT
| SESSION_USER
| SOME
***************
*** 11047,11052 ****
--- 11127,11144 ----
return (Node *) x;
}
+ static Node *
+ makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr, List *expr_list, int location)
+ {
+ GroupingSetsFunc *gsfunc = makeNode(GroupingSetsFunc);
+
+ gsfunc->identity = identity;
+ gsfunc->expr = expr;
+ gsfunc->expr_list = expr_list;
+ gsfunc->location = location;
+ return (Node *) gsfunc;
+ }
+
/* parser_init()
* Initialize to parse one query string
*/
*** ./src/backend/parser/Makefile.orig 2009-05-04 21:23:31.000000000 +0200
--- ./src/backend/parser/Makefile 2009-05-04 21:24:06.000000000 +0200
***************
*** 14,20 ****
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o
FLEXFLAGS = -CF
--- 14,21 ----
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o \
! parse_gsets.o
FLEXFLAGS = -CF
*** ./src/backend/parser/parse_agg.c.orig 2009-05-11 21:12:59.000000000 +0200
--- ./src/backend/parser/parse_agg.c 2009-05-13 21:16:24.000000000 +0200
***************
*** 32,42 ****
--- 32,51 ----
int sublevels_up;
} check_ungrouped_columns_context;
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ } transform_ungrouped_target_context;
+
static void check_ungrouped_columns(Node *node, ParseState *pstate,
List *groupClauses, bool have_non_var_grouping);
static bool check_ungrouped_columns_walker(Node *node,
check_ungrouped_columns_context *context);
+ static Node * transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses);
+
/*
* transformAggregateCall -
***************
*** 318,324 ****
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! clause = (Node *) qry->targetList;
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
--- 327,344 ----
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! if (pstate->parentParseState && pstate->parentParseState->p_hasGroupingSets)
! {
! clause = (Node *) transform_ungrouped_target((Node *) qry->targetList,
! pstate,
! groupClauses);
! /* HACK!!! - move to transform part*/
! qry->targetList = clause;
! }
! else
! clause = (Node *) qry->targetList;
!
!
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
***************
*** 426,431 ****
--- 446,517 ----
}
/*
+ * transform_ungrouped_target_mutator -
+ * All non-aggregate, non-constant columns are replaced by NULL;
+ * grouping and grouping_id functions are replaced by constants.
+ */
+ static Node *
+ transform_ungrouped_target_mutator(Node *node,
+ transform_ungrouped_target_context *context)
+ {
+ if (node == NULL)
+ return NULL;
+ if (IsA(node, Aggref))
+ return node;
+
+ if (IsA(node, GroupingSetsFunc))
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ int result = 0;
+
+ if (gsf->identity == FUNCTION_GROUPING)
+ {
+ result = list_member(context->groupClause, gsf->expr) ? 0 : 1;
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else if (gsf->identity == FUNCTION_GROUPING_ID)
+ {
+ ListCell *el;
+
+ foreach(el, gsf->expr_list)
+ {
+ result = result << 1;
+ if (!list_member(context->groupClause, lfirst(el)))
+ result = result | 0x01;
+ }
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else /* replace Cube and Rollup by FuncCall node */
+ {
+ /* ToDo: Support cube function */
+ }
+ }
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ if (list_member(context->groupClause, node))
+ return node;
+ else
+ return (Node *) makeNullConst(var->vartype, var->vartypmod);
+ }
+
+ return expression_tree_mutator(node,
+ transform_ungrouped_target_mutator, context);
+ }
+
+ static Node *
+ transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses)
+ {
+ transform_ungrouped_target_context context;
+
+ context.pstate = pstate;
+ context.groupClause = groupClauses;
+ return transform_ungrouped_target_mutator(node, &context);
+ }
+ /*
* check_ungrouped_columns -
* Scan the given expression tree for ungrouped variables (variables
* that are not listed in the groupClauses list and are not within
*** ./src/backend/parser/parse_expr.c.orig 2009-05-12 22:33:29.000000000 +0200
--- ./src/backend/parser/parse_expr.c 2009-05-13 18:41:54.000000000 +0200
***************
*** 273,278 ****
--- 273,296 ----
case T_CurrentOfExpr:
result = transformCurrentOfExpr(pstate, (CurrentOfExpr *) expr);
break;
+
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) expr;
+ ListCell *lc;
+ List *expr_list = NIL;
+
+ gsf->expr = (Node *) transformExpr(pstate, (Node *) gsf->expr);
+
+ foreach(lc, gsf->expr_list)
+ {
+ expr_list = lappend(expr_list, transformExpr(pstate,
+ (Node *) lfirst(lc)));
+ }
+ gsf->expr_list = expr_list;
+ result = expr;
+ break;
+ }
/*********************************************
* Quietly accept node types that may be presented when we are
***************
*** 304,309 ****
--- 322,328 ----
case T_CoerceToDomain:
case T_CoerceToDomainValue:
case T_SetToDefault:
+ case T_EmptySet:
{
result = (Node *) expr;
break;
*** ./src/backend/parser/parse_gsets.c.orig 2009-05-04 21:13:54.000000000 +0200
--- ./src/backend/parser/parse_gsets.c 2009-05-13 18:51:59.000000000 +0200
***************
*** 0 ****
--- 1,677 ----
+ /*-------------------------------------------------------------------------
+ *
+ * parse_gsets.c
+ * parser grouping sets transformations
+ *
+ * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * $PostgreSQL: $
+ *
+ *-------------------------------------------------------------------------
+ */
+ #include "postgres.h"
+ #include "catalog/namespace.h"
+ #include "nodes/nodes.h"
+ #include "nodes/makefuncs.h"
+ #include "nodes/nodeFuncs.h"
+ #include "nodes/value.h"
+ #include "nodes/pg_list.h"
+ #include "nodes/parsenodes.h"
+ #include "optimizer/tlist.h"
+ #include "parser/parse_clause.h"
+ #include "parser/parse_gsets.h"
+ #include "parser/parse_node.h"
+ #include "rewrite/rewriteManip.h"
+
+ #include <string.h>
+
+
+ #define UNDEF_LOCATION -1
+
+
+ #define ORDER_CLAUSE 0
+ #define GROUP_CLAUSE 1
+ #define DISTINCT_ON_CLAUSE 2
+
+ #define MAX_CUBE_FIELDS 8
+
+ #define IsGroupingSetsSpec(n) (IsA(n, GroupingSetsSpec))
+ #define IsUndefLoc(n) (n == UNDEF_LOCATION)
+ #define IsNIL(n) (n == NIL)
+
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ bool isGlobalTL;
+ } transform_ungroup_cols_context;
+
+ static List *add_distinct_set(List *gsets_list, List *gsets);
+ static List *add_distinct_sets(List *cum_gsets_list, List *gsets_list);
+ static Node *set_multiplication(List *list, bool has_empty_set);
+ static Node *expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset);
+
+
+ /*
+ * return true, when GroupingSet is empty
+ */
+ static bool
+ IsEmptyGroupingSet(Node *node)
+ {
+ if (node == NULL)
+ return true;
+ if (IsA(node, EmptySet))
+ return true;
+ if (IsA(node, GroupingSetsSpec))
+ return IsNIL(((GroupingSetsSpec *) node)->set_list);
+
+ return false;
+ }
+
+ /*
+ * expandGroupingSets does:
+ * a) search for and replace cube and rollup funccall nodes with GroupingSetsSpec
+ * nodes,
+ * b) multiply all sets in the GROUP BY list,
+ * c) drop all empty-set nodes.
+ *
+ */
+ Node *
+ expandGroupingSets(ParseState *pstate, List *grouplist)
+ {
+ ListCell *g;
+ List *result = NIL;
+ bool has_empty_set = false;
+ int location = UNDEF_LOCATION;
+ bool is_grouping_sets = false;
+
+ /* leave fast when grouplist is NIL */
+ if (IsNIL(grouplist))
+ return NULL;
+
+ /* detect single group by grouping set */
+ foreach(g, grouplist)
+ {
+ Node *el = (Node *) lfirst(g);
+ Node *expel;
+ bool isemptyset;
+
+ /*
+ * Is it an empty set? NULL is a shortcut for the empty set.
+ * This has a disadvantage - we lose the location - but it is short.
+ */
+ if (IsEmptyGroupingSet(el))
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ /* take location from first item of list */
+ if (IsUndefLoc(location))
+ location = exprLocation(el);
+
+ expel = expandGroupingSetsOperator(pstate, el, &isemptyset);
+
+ if (isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ if (IsA(el, GroupingSetsSpec))
+ is_grouping_sets = true;
+ else
+ {
+ switch (((GroupingSetsFunc *) el)->identity)
+ {
+ case FUNCTION_CUBE:
+ case FUNCTION_ROLLUP:
+ is_grouping_sets = true;
+ break;
+ default:
+ is_grouping_sets = false;
+ }
+ }
+
+ result = lappend(result, expel);
+ }
+
+ /*
+ * when no grouping sets were detected, but there is an empty set,
+ * build an implicit grouping set
+ */
+ if (!is_grouping_sets)
+ {
+ if (has_empty_set)
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = result;
+ gss->has_empty_set = true;
+ gss->location = location;
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) gss;
+ }
+ else
+ /* when nothing from the grouping sets spec is used, return the original list */
+ return (Node *) grouplist;
+ }
+
+ /* multiply all sets in the GROUP BY list */
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) set_multiplication(result, has_empty_set);
+ }
+
+ /*
+ * This function has two purposes - first, take the list from a row expression;
+ * second, remove all empty sets and all duplicate sets.
+ * ToDo: add a flag to RowExpr that carries info about explicit use of the
+ * ROW keyword.
+ */
+ static List **
+ adjustElements(List *elements, int *nfields, bool *has_empty_set)
+ {
+ ListCell *l;
+ List *felements = NIL; /* filtered elements, without empty sets */
+ int _nfields;
+ List **result;
+ int i = 0;
+
+ *has_empty_set = false;
+ foreach(l, elements)
+ {
+ if (IsEmptyGroupingSet((Node *) lfirst(l)))
+ {
+ *has_empty_set = true;
+ continue;
+ }
+
+ felements = list_append_unique(felements, lfirst(l));
+ }
+
+ if (IsNIL(felements))
+ return NULL;
+
+ _nfields = list_length(felements);
+ result = palloc(_nfields * sizeof(List *));
+
+ foreach(l, felements)
+ {
+ Node *el = (Node *) lfirst(l);
+
+ switch (nodeTag(el))
+ {
+ case T_RowExpr:
+ result[i++] = ((RowExpr *) el)->args;
+ break;
+ default:
+ result[i++] = list_make1(el);
+
+ }
+ }
+
+ *nfields = _nfields;
+ return result;
+ }
+
+ /*
+ * This function adds one grouping set to a list of sets. The result is a
+ * list of distinct sets, i.e. a list of distinct lists.
+ *
+ * Note: "distinct" in the function name is not related to the ALL|DISTINCT
+ * quantifier of the GROUP BY clause.
+ */
+ static List *
+ add_distinct_set(List *gsets_list, List *gsets)
+ {
+
+ List *uniqset = NIL; /* unique gsets */
+ ListCell *l;
+ int act_len;
+ bool found = false;
+
+ /*
+ * Deduplicate the fields of the incoming set; gsets_list itself
+ * already contains only distinct sets.
+ */
+ foreach (l, gsets)
+ uniqset = list_append_unique(uniqset, lfirst(l));
+
+ /* identical sets must have the same length */
+ act_len = list_length(uniqset);
+
+ foreach(l, gsets_list)
+ {
+ Node *node = (Node *) lfirst(l);
+
+ Assert(IsA(node, List));
+ if (list_length((List *) node) == act_len)
+ {
+ ListCell *lc;
+
+ found = true;
+ foreach(lc, (List *) node)
+ if (!list_member(uniqset, (Node *) lfirst(lc)))
+ {
+ found = false;
+ break;
+ }
+ if (found)
+ break;
+ }
+ }
+
+ if (!found)
+ return lappend(gsets_list, uniqset);
+ else
+ return gsets_list;
+ }
+
+ /*
+ * This is used for nested constructs such as GROUPING SETS(a, ROLLUP(a,b)).
+ */
+ static List *
+ add_distinct_sets(List *cum_gsets_list, List *gsets_list)
+ {
+ ListCell *l;
+
+ foreach(l, gsets_list)
+ {
+ Node *el = (Node *) lfirst(l);
+ Assert(IsA(el, List));
+ cum_gsets_list = add_distinct_set(cum_gsets_list, (List *) el);
+ }
+ return cum_gsets_list;
+ }
+
+ /*
+ * Evaluate CUBE and ROLLUP operators -
+ * transform a GroupingSetsFunc into a GroupingSetsSpec.
+ *
+ * Note: This routine doesn't necessarily replace all GroupingSetsFunc nodes.
+ * It evaluates only the GROUPING SETS context. Such nodes can appear
+ * elsewhere - mainly CUBE as a FuncCall - so we have to run a walker later
+ * to check (and possibly replace) them.
+ */
+ static Node *
+ expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset)
+ {
+ bool has_empty_set = false;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ int location;
+
+ if (IsEmptyGroupingSet(node))
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ *isemptyset = false;
+
+ switch (nodeTag(node))
+ {
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ List **sets;
+ List *sets_union;
+ int nfields = list_length(gsf->expr_list);
+
+ if (gsf->identity == FUNCTION_CUBE && nfields > MAX_CUBE_FIELDS)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("more than eight fields in CUBE operator"),
+ parser_errposition(pstate, gsf->location)));
+
+ sets = adjustElements(gsf->expr_list, &nfields, &has_empty_set);
+ if (sets == NULL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ switch (gsf->identity)
+ {
+ case FUNCTION_CUBE:
+ {
+ /*
+ * Grouping Sets CUBE operator
+ * This operator expands the set {a,b} into the sets {{a,b},{a},{b},{}}.
+ *
+ * The combinations are generated by counting a bitmap down,
+ * e.g. for three fields:
+ * 111 7 {a,b,c}
+ * 110 6 {a,b}
+ * 101 5 {a,c}
+ * 100 4 {a}
+ * 011 3 {b,c}
+ * 010 2 {b}
+ * 001 1 {c}
+ * 000 0 empty set
+ */
+ unsigned int btab[] = {0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff};
+ unsigned int bitmap = btab[nfields - 1];
+ unsigned int masc = bitmap;
+ unsigned int selector = 1 << (nfields - 1);
+ int i;
+
+ sets_union = NIL;
+ while (bitmap > 0)
+ {
+ unsigned int processed_bitmap = bitmap--;
+
+ sets_union = NIL;
+ for (i = 0; processed_bitmap > 0; i++)
+ {
+ if (processed_bitmap & selector)
+ sets_union = list_union(sets_union, sets[i]);
+
+ processed_bitmap = (processed_bitmap << 1) & masc;
+ }
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ case FUNCTION_ROLLUP:
+ {
+ /*
+ * Grouping Sets ROLLUP operator
+ * This operator expands the set {a,b} into the prefix sets {{a,b},{a},{}}.
+ */
+ int i;
+ int j;
+
+ set_list = NIL;
+ for (i = 0; i < nfields; i++)
+ {
+ sets_union = NIL;
+ for (j = 0; j < nfields - i; j++)
+ sets_union = list_union(sets_union, (List *) sets[j]);
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ default:
+ /*
+ * nothing to expand here for FUNCTION_GROUPING and
+ * FUNCTION_GROUPING_ID - return the original node
+ */
+ pfree(sets);
+
+ return node;
+ }
+
+ pfree(sets);
+ location = gsf->location;
+ has_empty_set = true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ /*
+ * GROUPING SETS (...)
+ * a) evaluate nested CUBE and ROLLUP operators
+ * b) ensure the sets are distinct
+ * c) detect an empty set inside
+ */
+ ListCell *gse;
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+
+ set_list = NIL;
+ location = gss->location;
+
+ foreach(gse, gss->set_list)
+ {
+ Node *grouping_element = (Node *) lfirst(gse);
+ Node *expanded_set = expandGroupingSetsOperator(pstate,
+ grouping_element, isemptyset);
+
+ if (*isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ switch (nodeTag(expanded_set))
+ {
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *egss = (GroupingSetsSpec *) expanded_set;
+
+ if (egss->has_empty_set)
+ has_empty_set = true;
+
+ if (egss->set_list != NIL)
+ set_list = add_distinct_sets(set_list, egss->set_list);
+ }
+ break;
+
+ case T_RowExpr:
+ {
+ /*
+ * The parser recognizes (c,b) in GROUPING SETS(a,b,(c,b)) as
+ * ROW(c,b), so it is necessary to fold it into the set list here.
+ */
+ ListCell *l;
+ RowExpr *rexpr = (RowExpr *) expanded_set;
+
+ foreach(l, rexpr->args)
+ if (IsEmptyGroupingSet((Node *)lfirst(l)))
+ {
+ has_empty_set = true;
+ break;
+ }
+
+ set_list = add_distinct_set(set_list, rexpr->args);
+ }
+ break;
+
+ default:
+ set_list = add_distinct_set(set_list, list_make1(expanded_set));
+ }
+ }
+ }
+ break;
+
+ default:
+ return node;
+ }
+
+ if (set_list == NIL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ result->has_empty_set = has_empty_set;
+ result->set_list = set_list;
+ result->location = location;
+ /* this grouping set isn't empty */
+ *isemptyset = false;
+
+ return (Node *) result;
+ }
+
+ /* multiply two nodes:
+ * GroupingSetsSpec x GroupingSetsSpec
+ * Node x Node
+ * Node x GroupingSetsSpec
+ * GroupingSetsSpec x Node
+ */
+ static Node *
+ multiple(Node *a, Node *b)
+ {
+ ListCell *l;
+ ListCell *lj;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ GroupingSetsSpec *gss_a;
+ GroupingSetsSpec *gss_b;
+
+ /* leave when nodes are equal */
+ if (equal(a, b))
+ return a;
+
+ if (!IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make2(a,b));
+ gss->has_empty_set = false;
+ gss->location = exprLocation(a);
+
+ return (Node *) gss;
+ }
+ if (IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ Node *aux = a;
+
+ a = b; b = aux;
+ }
+ if (!IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b))
+ {
+ gss_b = (GroupingSetsSpec *) b;
+
+ foreach(l, gss_b->set_list)
+ {
+ List *set = (List *) lfirst(l);
+
+ Assert(IsA(lfirst(l), List));
+
+ /* add item to all grouping sets */
+ set = list_append_unique(set, a);
+ set_list = add_distinct_set(set_list, set);
+ }
+
+ if (gss_b->has_empty_set)
+ {
+ set_list = add_distinct_set(set_list, (list_make1(a)));
+ gss_b->has_empty_set = false;
+ }
+
+ gss_b->set_list = set_list;
+
+ return (Node *) b;
+ }
+
+ /*
+ * ((A,B),(C)) * ((X,Y),()) = ((A,B,X,Y),(A,B),(C,X,Y),(C))
+ */
+ Assert(IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b));
+
+ gss_a = (GroupingSetsSpec *) a;
+ gss_b = (GroupingSetsSpec *) b;
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+
+ foreach(l, gss_a->set_list)
+ {
+ foreach(lj, gss_b->set_list)
+ set_list = add_distinct_set(set_list,
+ list_union((List *) lfirst(l),
+ (List *) lfirst(lj)));
+ if (gss_b->has_empty_set)
+ set_list = add_distinct_set(set_list,
+ (List *) lfirst(l));
+ }
+
+ if (gss_a->has_empty_set)
+ foreach(l, gss_b->set_list)
+ set_list = add_distinct_set(set_list, (List *) lfirst(l));
+
+ result->set_list = set_list;
+ result->has_empty_set = gss_a->has_empty_set && gss_b->has_empty_set;
+ result->location = gss_a->location;
+
+ return (Node *) result;
+ }
+
+ /*
+ * multiply the elements of a list -> (a1,a2,a3,a4) => (((a1 x a2) x a3) x a4)
+ */
+ static Node *
+ set_multiplication(List *list, bool has_empty_set)
+ {
+ int nfields = list_length(list);
+ ListCell *lc;
+ bool first_iter = true;
+ Node *stack = NULL;
+
+ foreach(lc, list)
+ {
+ Node *el = (Node *) lfirst(lc);
+
+ /* Propagate info about empty set to first item */
+ if (first_iter)
+ {
+ if (has_empty_set)
+ {
+ if (IsGroupingSetsSpec(el))
+ {
+ ((GroupingSetsSpec *) el)->has_empty_set = true;
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return el;
+
+ stack = el;
+ }
+ else
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make1(el));
+ gss->has_empty_set = true;
+ gss->location = exprLocation(el);
+
+ stack = (Node *) gss;
+
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return stack;
+ }
+ }
+ else
+ stack = el;
+
+ first_iter = false;
+ /* go on to the next item */
+ continue;
+ }
+
+ Assert(stack != NULL);
+ stack = multiple(stack, el);
+ }
+
+ return stack;
+ }
+
+ /*
+ * This procedure prepares the global targetList, the local targetLists
+ * and the local groupClauses.
+ */
+ List *
+ transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause)
+ {
+ List *globalTL = copyObject((Node *)*targetList);
+ List *globalGC = NIL;
+ List *gpcs = NIL;
+ List *tls = NIL;
+ ListCell *gsc;
+ GroupingSetsSpec *gsspec = (GroupingSetsSpec *) linitial(groupClause);
+
+ *groupClauses = gpcs;
+ *targetLists = tls;
+ *targetList = globalTL;
+
+ return globalGC;
+ }
+
*** ./src/backend/parser/parse_target.c.orig 2009-05-04 21:16:40.000000000 +0200
--- ./src/backend/parser/parse_target.c 2009-05-04 21:22:30.000000000 +0200
***************
*** 1446,1451 ****
--- 1446,1467 ----
case T_XmlSerialize:
*name = "xmlserialize";
return 2;
+ case T_GroupingSetsFunc:
+ switch (((GroupingSetsFunc *) node)->identity)
+ {
+ case FUNCTION_GROUPING:
+ *name = "grouping";
+ return 2;
+ case FUNCTION_GROUPING_ID:
+ *name = "grouping_id";
+ return 2;
+ case FUNCTION_CUBE: /* keep the compiler quiet */
+ case FUNCTION_ROLLUP:
+ /* nothing */
+ break;
+ }
+ break;
+
default:
break;
}
*** ./src/include/nodes/nodes.h.orig 2009-05-04 19:32:41.000000000 +0200
--- ./src/include/nodes/nodes.h 2009-05-13 11:33:34.000000000 +0200
***************
*** 376,381 ****
--- 376,385 ----
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_GroupingSetsFunc,
+ T_GroupingSetsSpec,
+ T_EmptySet,
+ T_GroupByClause,
/*
* TAGS FOR RANDOM OTHER STUFF
*** ./src/include/nodes/parsenodes.h.orig 2009-05-04 19:15:28.000000000 +0200
--- ./src/include/nodes/parsenodes.h 2009-05-13 17:46:08.000000000 +0200
***************
*** 374,379 ****
--- 374,406 ----
} SortBy;
/*
+ * GroupingSetsSpec - for GROUP BY GROUPING SETS clause
+ */
+ typedef struct GroupingSetsSpec
+ {
+ NodeTag type;
+ List *set_list;
+ bool has_empty_set; /* true when grouping sets contains empty set */
+ int location;
+ } GroupingSetsSpec;
+
+ typedef struct EmptySet
+ {
+ NodeTag type;
+ } EmptySet;
+
+ /*
+ * GroupByClause for GROUP BY clause
+ */
+ typedef struct GroupByClause
+ {
+ NodeTag type;
+ bool all;
+ List *fields;
+ int location;
+ } GroupByClause;
+
+ /*
* WindowDef - raw representation of WINDOW and OVER clauses
*
* For entries in a WINDOW list, "name" is the window name being defined.
***************
*** 416,421 ****
--- 443,468 ----
FRAMEOPTION_END_CURRENT_ROW)
/*
* GroupingSetsFunc - parser node for the GROUPING, GROUPING_ID, CUBE and ROLLUP quasi-functions
+ */
+ typedef enum GroupingSetsFuncIdentity
+ {
+ FUNCTION_GROUPING,
+ FUNCTION_GROUPING_ID,
+ FUNCTION_CUBE,
+ FUNCTION_ROLLUP
+ } GroupingSetsFuncIdentity;
+
+ typedef struct GroupingSetsFunc
+ {
+ NodeTag type;
+ GroupingSetsFuncIdentity identity;
+ Node *expr;
+ List *expr_list;
+ int location;
+ } GroupingSetsFunc;
+
+ /*
* RangeSubselect - subquery appearing in a FROM clause
*/
typedef struct RangeSubselect
***************
*** 939,945 ****
List *targetList; /* the target list (of ResTarget) */
List *fromClause; /* the FROM clause */
Node *whereClause; /* WHERE qualification */
! List *groupClause; /* GROUP BY clauses */
Node *havingClause; /* HAVING conditional-expression */
List *windowClause; /* WINDOW window_name AS (...), ... */
WithClause *withClause; /* WITH clause */
--- 986,992 ----
List *targetList; /* the target list (of ResTarget) */
List *fromClause; /* the FROM clause */
Node *whereClause; /* WHERE qualification */
! Node *groupClause; /* GROUP BY clauses */
Node *havingClause; /* HAVING conditional-expression */
List *windowClause; /* WINDOW window_name AS (...), ... */
WithClause *withClause; /* WITH clause */
*** ./src/include/parser/kwlist.h.orig 2009-05-04 18:08:32.000000000 +0200
--- ./src/include/parser/kwlist.h 2009-05-05 14:44:39.000000000 +0200
***************
*** 98,103 ****
--- 98,104 ----
PG_KEYWORD("createuser", CREATEUSER, UNRESERVED_KEYWORD)
PG_KEYWORD("cross", CROSS, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("csv", CSV, UNRESERVED_KEYWORD)
+ PG_KEYWORD("cube", CUBE, RESERVED_KEYWORD)
PG_KEYWORD("current", CURRENT_P, UNRESERVED_KEYWORD)
PG_KEYWORD("current_catalog", CURRENT_CATALOG, RESERVED_KEYWORD)
PG_KEYWORD("current_date", CURRENT_DATE, RESERVED_KEYWORD)
***************
*** 168,173 ****
--- 169,176 ----
PG_KEYWORD("granted", GRANTED, UNRESERVED_KEYWORD)
PG_KEYWORD("greatest", GREATEST, COL_NAME_KEYWORD)
PG_KEYWORD("group", GROUP_P, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping", GROUPING, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping_id", GROUPING_ID, COL_NAME_KEYWORD)
PG_KEYWORD("handler", HANDLER, UNRESERVED_KEYWORD)
PG_KEYWORD("having", HAVING, RESERVED_KEYWORD)
PG_KEYWORD("header", HEADER_P, UNRESERVED_KEYWORD)
***************
*** 317,322 ****
--- 320,326 ----
PG_KEYWORD("right", RIGHT, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("role", ROLE, UNRESERVED_KEYWORD)
PG_KEYWORD("rollback", ROLLBACK, UNRESERVED_KEYWORD)
+ PG_KEYWORD("rollup", ROLLUP, RESERVED_KEYWORD)
PG_KEYWORD("row", ROW, COL_NAME_KEYWORD)
PG_KEYWORD("rows", ROWS, UNRESERVED_KEYWORD)
PG_KEYWORD("rule", RULE, UNRESERVED_KEYWORD)
***************
*** 334,339 ****
--- 338,344 ----
PG_KEYWORD("session_user", SESSION_USER, RESERVED_KEYWORD)
PG_KEYWORD("set", SET, UNRESERVED_KEYWORD)
PG_KEYWORD("setof", SETOF, COL_NAME_KEYWORD)
+ PG_KEYWORD("sets", SETS, UNRESERVED_KEYWORD)
PG_KEYWORD("share", SHARE, UNRESERVED_KEYWORD)
PG_KEYWORD("show", SHOW, UNRESERVED_KEYWORD)
PG_KEYWORD("similar", SIMILAR, TYPE_FUNC_NAME_KEYWORD)
*** ./src/include/parser/parse_gsets.h.orig 2009-05-04 21:41:52.000000000 +0200
--- ./src/include/parser/parse_gsets.h 2009-05-11 11:22:57.000000000 +0200
***************
*** 0 ****
--- 1,14 ----
+ #ifndef PARSE_GSETS_H
+ #define PARSE_GSETS_H
+
+ #include "parser/parse_node.h"
+ #include "nodes/pg_list.h"
+
+ Node *expandGroupingSets(ParseState *pstate, List *grouplist);
+ List *transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause);
+
+
+
+ #endif /* PARSE_GSETS_H */
*** ./src/include/parser/parse_node.h.orig 2009-05-11 20:31:38.000000000 +0200
--- ./src/include/parser/parse_node.h 2009-05-11 20:32:05.000000000 +0200
***************
*** 101,106 ****
--- 101,107 ----
bool p_is_insert;
bool p_is_update;
Relation p_target_relation;
+ bool p_hasGroupingSets;
RangeTblEntry *p_target_rangetblentry;
} ParseState;
Hello,
here is my latest grouping sets patch - further work will be done by Oleg.
This patch implements the full functionality of grouping sets,
including support for grouping by expressions.
postgres=# select extract(year from a),a,b, sum(c) from foo group by
grouping sets(extract(year from a), a, b,());
 date_part |     a      | b  | sum
-----------+------------+----+-----
      2001 |            |    |  40
           | 2001-10-12 |    |  20
           | 2001-10-11 |    |  20
           |            | 10 |  40
           |            |    |  40
(5 rows)
postgres=# select * from foo;
     a      | b  | c
------------+----+----
2001-10-11 | 10 | 20
2001-10-12 | 10 | 20
(2 rows)
This patch is WIP - it has full functionality, but it is based on CTE,
which is a more or less ugly hack. For testing and a first look at
grouping sets, though, it should do a good job.
regards
Pavel Stehule
2009/5/10 Олег Царев <zabivator@gmail.com>:
I will write a separate mail about ROLLUP.
Can you send me some links or information about "CTE"? What is it?
Also, I need some deeper knowledge about PostgreSQL aggregate
calculation (the executor part) - can you help me (links, books, names
of source files, etc.)?
After getting that additional information I can continue the discussion.
Thanks
2009/5/10 Pavel Stehule <pavel.stehule@gmail.com>:

2009/5/10 Олег Царев <zabivator@gmail.com>:
Hello, Pavel.
I read your letter and found some points:
a) A simple transformation of any GROUPING SETS into GROUP BY + UNION
ALL requires cloning the source for every group.
That is slow.
My point: we can start with a simple implementation.
b) Your solution is good, IMHO
WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b
But can you describe to me how it is handled by the planner (cost
calculation) and the executor?

Look at the CTE source code or ask the CTE author.
Sharing the source means we don't process a tree any more, we process
a DIRECTED GRAPH. Hard, many bugs, and so on, no?

For example, the code for DISTINCT is shared with the code for
aggregation. My idea is based on similarity: GROUPING SETS is a subset
of CTE, and a nonrecursive CTE is a subset of UNION ALL.
Does PostgreSQL support sharing executor nodes? How? Where can I read
about it?

I don't know - what exactly do you mean by sharing nodes? Before CTE I
had to write a special holder node, but that isn't necessary now.

Next. OK, we spool the source. Do we then use a simple hash group /
sort group + UNION ALL, or more advanced techniques? If it is separate
GROUP BYs, that is no different from my idea; it is just additional
logic (spooling the source for fast rewind).

I think a trivial implementation via UNION ALL should work, but any
optimisation means a lot of work - and once you add rewind
functionality, you end up with a trivial nonrecursive CTE
implementation. When I wrote the patch, it wasn't possible to use CTE,
so I played with my own nodes. Now the cleaner solution is probably
based on the current CTE base.

c) Don't forget about node requirements - if we use a hash table as
the container for the source, we lose some sorting properties - and a
query like select A,B from TABLE group by ((A,B),(B)) order by a,b
requires sorting of the grouping sets output.
That means we need to insert a sort.

Yes, some techniques to eliminate the final sort op should probably be
possible; I am not sure it is really important. Now I am thinking
about which is the bigger devil: a) patch complexity, b) some
suboptimality (mainly redundant calls of agg nodes). For me, a).
Aggregates are not free, but a repeated seq scan of the data is
significantly more expensive, so that is the first goal. Later we
should think about further optimizations.

d) We can make a SORT ROLLUP and a SORT REDUCE ROLLUP.
First, a sort group-by uses sorting to group the data and calculate
the aggregates. So data ordered by A,B,C is also ordered by A,B, by A,
and by (). What does that mean? It means we can use the sort
properties to implement a SORT ROLLUP. For every subgroup we calculate
its own aggregates... oh, I am saying it badly. Consider "select
a,sum(b) from table group by a": the sort group groups by a and
calculates the aggregates (sum) for every a-group. For ROLLUP(A,B) we
can maintain three aggregations - for (A,B), then (A), then () - and
in a single pass over the source calculate the aggregates at every
level, instead of splitting the source into different subsets.
It performs better than separate GROUP BYs combined with UNION ALL.

Maybe, I don't know. What I know:
* UNION ALL will give better results on indexed sets and for MIN, MAX
aggregates. CTE isn't optimized for these cases yet.
* UNION ALL will be slow for large sets (exceeding cache),
* CTE needs some more optimizations,
* pg core hackers dislike big and complex patches,
* pg core hackers like incremental improvements to the current code base.

I am thinking that Grouping Sets based on CTE should be more
committable code. That doesn't mean your ideas are wrong, but those
optimizations should work on CTE too.

select * from table group by rollup(a,b,c)

has to generate the same plan as
with q as (select * from table)
select * from q group by a,b,c
union all
select * from q group by a,b
union all
select * from q group by a
union all
select * from q;

and CTE is more general than Grouping Sets, so it is better to do the
optimization over CTE than over Grouping Sets.

Regards
Pavel Stehule

Look at MS SQL:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
Why doesn't MS SQL support distinct aggregates? I think because every
distinct aggregate in MS SQL requires a hash, and many hashed
aggregations are slow.

Thank you!
2009/5/10 Pavel Stehule <pavel.stehule@gmail.com>:

Hello

Some other info is at http://wiki.postgresql.org/wiki/Grouping_Sets

Maybe in a few weeks I'll have a prototype based on CTE. My older
technique based on hashed tables should work well, but it carries a
lot of unshared code. Using CTE means we can optimize CTE and GROUPING
SETS together. I plan to transform the query

SELECT * FROM some GROUP BY GROUPING SETS(a, b)
to
WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b

2009/5/10 Олег Царев <zabivator@gmail.com>:
Hello all.
Please approve my ideas for the implementation.

The standard has feature T431: Extended grouping capabilities.
I found this feature in the TODO list:
http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO

MS SQL 2005 partially supports this feature:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx

MS SQL 2008 supports this feature:
http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx

Oracle supports this feature:
http://www.compshack.com/sql/oracle-group-rollup

These are short notes about GROUPING SETS; more complete information
is in the official documentation of MS SQL and Oracle (copyright
prevents sending it as an attachment).

First: GROUPING SETS.
select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A),
() ) - this is an example of using grouping sets.
The semantics of this construction: group the source by more than one
set of columns.
(A,B,C) is the widest key. In the result set of this example we can
find the result set of select A,B,C,SUM(D) from table group by A,B,C
as a subset. That means "GROUP BY A,B,C" is a subset of "GROUP BY
GROUPING SETS( (A,B,C), (A), () )".
The second subset is GROUP BY A; instead of the B and C columns we see
NULL.
The third subset is GROUP BY (): NULL instead of A,B,C, and a single
row named the "GRAND TOTAL" - calculated over the whole source without
grouping.

There is also the function "GROUPING"; it tells whether a NULL is a
"real NULL" (from the table) or one generated by "GROUP BY GROUPING
SETS".

My point: this feature can be implemented on top of GROUP BY and UNION
ALL. We can decompose "GROUP BY GROUPING SETS( (A,B,C),(A),() )" into
select A,B,C from table GROUP BY A,B,C UNION ALL select A,NULL,NULL
from table GROUP BY A UNION ALL select NULL,NULL,NULL from table GROUP
BY ().
So it is very simple; it requires no modification of the executor or
cost calibration - only the parser and semantic analysis.
So, ROLLUP(A1,...,An) is an alias for "GROUP BY GROUPING SETS(
(A1,...,An),(A1,...,An-1),...,(A1,A2),(A1),() )";
CUBE is analogous.

If this idea is good, I can write code based on the old patch
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
from a clean slate (as you wish).

This is suboptimal. When you run SELECT * FROM X GROUP BY GROUPING
SETS ..., where X is some joined data, you repeat the JOIN for every
grouping set. So your solution is simple to implement, but it would be
really slow.

Regards
Pavel Stehule

In the future I know how to implement ROLLUP more optimally (as an
executor iterator) and use that ROLLUP to optimize other GROUP BY and
GROUPING SETS queries.

Thanks.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachments:
gsets-0.6.diff (text/x-patch; charset=US-ASCII)
*** ./src/backend/nodes/copyfuncs.c.orig 2009-05-04 19:54:52.000000000 +0200
--- ./src/backend/nodes/copyfuncs.c 2009-05-13 18:27:38.000000000 +0200
***************
*** 1056,1061 ****
--- 1056,1118 ----
}
/*
+ * _copyGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _copyGroupingSetsFunc(GroupingSetsFunc *from)
+ {
+ GroupingSetsFunc *newnode = makeNode(GroupingSetsFunc);
+
+ COPY_SCALAR_FIELD(identity);
+ COPY_NODE_FIELD(expr);
+ COPY_NODE_FIELD(expr_list);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
+ * _copyGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _copyGroupingSetsSpec(GroupingSetsSpec *from)
+ {
+ GroupingSetsSpec *newnode = makeNode(GroupingSetsSpec);
+
+ COPY_NODE_FIELD(set_list);
+ COPY_SCALAR_FIELD(has_empty_set);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
+ * _copyEmptySet
+ */
+ static EmptySet *
+ _copyEmptySet(EmptySet *from)
+ {
+ EmptySet *newnode = makeNode(EmptySet);
+
+ return newnode;
+ }
+
+ /*
+ * _copyGroupByClause
+ */
+ static GroupByClause *
+ _copyGroupByClause(GroupByClause *from)
+ {
+ GroupByClause *newnode = makeNode(GroupByClause);
+
+ COPY_SCALAR_FIELD(all);
+ COPY_NODE_FIELD(fields);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+ }
+
+ /*
* _copyScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 4093,4098 ****
--- 4150,4167 ----
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
+ case T_GroupingSetsFunc:
+ retval = _copyGroupingSetsFunc(from);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _copyGroupingSetsSpec(from);
+ break;
+ case T_EmptySet:
+ retval = _copyEmptySet(from);
+ break;
+ case T_GroupByClause:
+ retval = _copyGroupByClause(from);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
*** ./src/backend/nodes/equalfuncs.c.orig 2009-05-04 20:01:09.000000000 +0200
--- ./src/backend/nodes/equalfuncs.c 2009-05-13 17:42:23.000000000 +0200
***************
*** 290,295 ****
--- 290,332 ----
}
static bool
+ _equalGroupingSetsFunc(GroupingSetsFunc *a, GroupingSetsFunc *b)
+ {
+ COMPARE_SCALAR_FIELD(identity);
+ COMPARE_NODE_FIELD(expr);
+ COMPARE_NODE_FIELD(expr_list);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
+ _equalGroupingSetsSpec(GroupingSetsSpec *a, GroupingSetsSpec *b)
+ {
+ COMPARE_NODE_FIELD(set_list);
+ COMPARE_SCALAR_FIELD(has_empty_set);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
+ _equalEmptySet(EmptySet *a, EmptySet *b)
+ {
+ return true;
+ }
+
+ static bool
+ _equalGroupByClause(GroupByClause *a, GroupByClause *b)
+ {
+ COMPARE_SCALAR_FIELD(all);
+ COMPARE_NODE_FIELD(fields);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+ }
+
+ static bool
_equalScalarArrayOpExpr(ScalarArrayOpExpr *a, ScalarArrayOpExpr *b)
{
COMPARE_SCALAR_FIELD(opno);
***************
*** 2871,2876 ****
--- 2908,2925 ----
case T_XmlSerialize:
retval = _equalXmlSerialize(a, b);
break;
+ case T_GroupingSetsFunc:
+ retval = _equalGroupingSetsFunc(a, b);
+ break;
+ case T_GroupingSetsSpec:
+ retval = _equalGroupingSetsSpec(a, b);
+ break;
+ case T_EmptySet:
+ retval = _equalEmptySet(a, b);
+ break;
+ case T_GroupByClause:
+ retval = _equalGroupByClause(a, b);
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
*** ./src/backend/nodes/nodeFuncs.c.orig 2009-05-04 21:56:15.000000000 +0200
--- ./src/backend/nodes/nodeFuncs.c 2009-05-05 14:05:41.000000000 +0200
***************
*** 917,922 ****
--- 917,928 ----
/* just use argument's location */
loc = exprLocation((Node *) ((PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupingSetsFunc:
+ loc = ((GroupingSetsFunc *) expr)->location;
+ break;
+ case T_GroupingSetsSpec:
+ loc = ((GroupingSetsSpec *) expr)->location;
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
***************
*** 1342,1347 ****
--- 1348,1368 ----
break;
case T_PlaceHolderInfo:
return walker(((PlaceHolderInfo *) node)->ph_var, context);
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+
+ if (walker(gsf->expr, context))
+ return true;
+ if (expression_tree_walker((Node *) gsf->expr_list, walker, context))
+ return true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ if (expression_tree_walker((Node *)((GroupingSetsSpec *) node)->set_list,
+ walker, context))
+ return true;
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
***************
*** 2016,2021 ****
--- 2037,2063 ----
return (Node *) newnode;
}
break;
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ GroupingSetsFunc *newnode;
+
+ FLATCOPY(newnode, gsf, GroupingSetsFunc);
+ MUTATE(newnode->expr, gsf->expr, Node *);
+ MUTATE(newnode->expr_list, gsf->expr_list, List *);
+ return (Node *) newnode;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+ GroupingSetsSpec *newnode;
+
+ FLATCOPY(newnode, gss, GroupingSetsSpec);
+ MUTATE(newnode->set_list, gss->set_list, List *);
+ return (Node *) newnode;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
*** ./src/backend/nodes/outfuncs.c.orig 2009-05-04 20:04:05.000000000 +0200
--- ./src/backend/nodes/outfuncs.c 2009-05-13 17:42:33.000000000 +0200
***************
*** 902,907 ****
--- 902,943 ----
}
static void
+ _outGroupingSetsFunc(StringInfo str, GroupingSetsFunc *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSFUNC");
+
+ WRITE_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ WRITE_NODE_FIELD(expr);
+ WRITE_NODE_FIELD(expr_list);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
+ _outGroupingSetsSpec(StringInfo str, GroupingSetsSpec *node)
+ {
+ WRITE_NODE_TYPE("GROUPINGSETSSPEC");
+
+ WRITE_NODE_FIELD(set_list);
+ WRITE_BOOL_FIELD(has_empty_set);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
+ _outEmptySet(StringInfo str, EmptySet *node)
+ {
+ WRITE_NODE_TYPE("EMPTYSET");
+ }
+
+ static void
+ _outGroupByClause(StringInfo str, GroupByClause *node)
+ {
+ WRITE_NODE_TYPE("GROUPBYCLAUSE");
+ WRITE_BOOL_FIELD(all);
+ WRITE_NODE_FIELD(fields);
+ WRITE_LOCATION_FIELD(location);
+ }
+
+ static void
_outScalarArrayOpExpr(StringInfo str, ScalarArrayOpExpr *node)
{
WRITE_NODE_TYPE("SCALARARRAYOPEXPR");
***************
*** 2774,2779 ****
--- 2810,2827 ----
case T_XmlSerialize:
_outXmlSerialize(str, obj);
break;
+ case T_GroupingSetsFunc:
+ _outGroupingSetsFunc(str, obj);
+ break;
+ case T_GroupingSetsSpec:
+ _outGroupingSetsSpec(str, obj);
+ break;
+ case T_EmptySet:
+ _outEmptySet(str, obj);
+ break;
+ case T_GroupByClause:
+ _outGroupByClause(str, obj);
+ break;
default:
*** ./src/backend/nodes/readfuncs.c.orig 2009-05-04 20:09:32.000000000 +0200
--- ./src/backend/nodes/readfuncs.c 2009-05-13 17:42:42.000000000 +0200
***************
*** 583,588 ****
--- 583,645 ----
}
/*
+ * _readGroupingSetsFunc
+ */
+ static GroupingSetsFunc *
+ _readGroupingSetsFunc(void)
+ {
+ READ_LOCALS(GroupingSetsFunc);
+
+ READ_ENUM_FIELD(identity, GroupingSetsFuncIdentity);
+ READ_NODE_FIELD(expr);
+ READ_NODE_FIELD(expr_list);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readGroupingSetsSpec
+ */
+ static GroupingSetsSpec *
+ _readGroupingSetsSpec(void)
+ {
+ READ_LOCALS(GroupingSetsSpec);
+
+ READ_NODE_FIELD(set_list);
+ READ_BOOL_FIELD(has_empty_set);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
+ * _readEmptySet
+ */
+ static EmptySet *
+ _readEmptySet(void)
+ {
+ READ_LOCALS_NO_FIELDS(EmptySet);
+
+ READ_DONE();
+ }
+
+ /*
+ *_readGroupByClause
+ */
+ static GroupByClause *
+ _readGroupByClause(void)
+ {
+ READ_LOCALS(GroupByClause);
+
+ READ_BOOL_FIELD(all);
+ READ_NODE_FIELD(fields);
+ READ_LOCATION_FIELD(location);
+
+ READ_DONE();
+ }
+
+ /*
* _readScalarArrayOpExpr
*/
static ScalarArrayOpExpr *
***************
*** 1274,1279 ****
--- 1331,1344 ----
return_value = _readNotifyStmt();
else if (MATCH("DECLARECURSOR", 13))
return_value = _readDeclareCursorStmt();
+ else if (MATCH("GROUPINGSETSFUNC", 16))
+ return_value = _readGroupingSetsFunc();
+ else if (MATCH("GROUPINGSETSSPEC", 16))
+ return_value = _readGroupingSetsSpec();
+ else if (MATCH("EMPTYSET", 8))
+ return_value = _readEmptySet();
+ else if (MATCH("GROUPBYCLAUSE", 13))
+ return_value = _readGroupByClause();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
*** ./src/backend/parser/analyze.c.orig 2009-05-05 14:14:42.000000000 +0200
--- ./src/backend/parser/analyze.c 2009-05-17 21:29:33.000000000 +0200
***************
*** 34,39 ****
--- 34,40 ----
#include "parser/parse_clause.h"
#include "parser/parse_coerce.h"
#include "parser/parse_cte.h"
+ #include "parser/parse_gsets.h"
#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "parser/parse_target.h"
***************
*** 148,153 ****
--- 149,311 ----
}
/*
+ * process GROUPING SETS
+ */
+ static SelectStmt *
+ makeSelectStmt(List *targetList, List *fromClause)
+ {
+ SelectStmt *n = makeNode(SelectStmt);
+ n->distinctClause = NULL;
+ n->intoClause = NULL;
+ n->targetList = targetList;
+ n->fromClause = fromClause;
+ n->whereClause = NULL;
+ n->groupClause = NULL;
+ n->havingClause = NULL;
+ n->windowClause = NIL;
+ n->withClause = NULL;
+ n->valuesLists = NIL;
+ n->sortClause = NIL;
+ n->limitOffset = NULL;
+ n->limitCount = NULL;
+ n->lockingClause = NIL;
+ n->op = SETOP_NONE;
+ n->all = false;
+ n->larg = NULL;
+ n->rarg = NULL;
+ return n;
+ }
+
+ static List *
+ makeStarTargetList(void)
+ {
+ ResTarget *rt = makeNode(ResTarget);
+
+ rt->name = NULL;
+ rt->indirection = NIL;
+ rt->val = (Node *) makeNode(ColumnRef);
+ ((ColumnRef *) rt->val)->fields = list_make1(makeNode(A_Star));
+ rt->location = -1;
+
+ return list_make1(rt);
+ }
+
+ static SelectStmt *
+ transformGroupingSets(ParseState *pstate, SelectStmt *stmt)
+ {
+ if (stmt->groupClause && IsA(stmt->groupClause, GroupByClause))
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) expandGroupingSets(pstate,
+ (List *)((GroupByClause *)stmt->groupClause)->fields);
+
+ if (pstate->p_hasGroupingSets)
+ {
+ CommonTableExpr *cte = makeNode(CommonTableExpr);
+ SelectStmt *cteedstmt;
+ int ngroupingsets = list_length(gss->set_list) + (gss->has_empty_set ? 1 : 0);
+ bool all = ((GroupByClause *) stmt->groupClause)->all;
+
+ cteedstmt = makeSelectStmt(NIL, NIL);
+ cteedstmt->intoClause = stmt->intoClause;
+ cteedstmt->sortClause = stmt->sortClause;
+ cteedstmt->limitOffset = stmt->limitOffset;
+ cteedstmt->limitCount = stmt->limitCount;
+ cteedstmt->lockingClause = stmt->lockingClause;
+
+ cte->ctename = "**g**";
+ cte->ctequery = (Node *) stmt;
+ cte->location = -1;
+
+ cteedstmt->withClause = makeNode(WithClause);
+ cteedstmt->withClause->ctes = list_make1(cte);
+ cteedstmt->withClause->recursive = false;
+ cteedstmt->withClause->location = -1;
+
+ /* when there is more than one grouping set, we must generate a setop node */
+ if (ngroupingsets > 1)
+ {
+ /* add a query under UNION ALL for every grouping set */
+ SelectStmt *larg = NULL;
+ SelectStmt *rarg;
+ ListCell *lc;
+
+ foreach(lc, gss->set_list)
+ {
+ List *groupClause;
+
+ Assert(IsA(lfirst(lc), List));
+ groupClause = (List *) lfirst(lc);
+
+ if (larg == NULL)
+ {
+ larg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ larg->groupClause = (Node *) groupClause;
+ larg->havingClause = copyObject(stmt->havingClause);
+ }
+ else
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->groupClause = (Node *) groupClause;
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = all;
+
+ larg = setop;
+ }
+ }
+ if (gss->has_empty_set)
+ {
+ SelectStmt *setop = makeSelectStmt(NIL, NIL);
+
+ rarg = makeSelectStmt(copyObject(stmt->targetList),
+ list_make1(makeRangeVar(NULL, "**g**", -1)));
+ rarg->havingClause = copyObject(stmt->havingClause);
+
+ setop->op = SETOP_UNION;
+ setop->larg = larg;
+ setop->rarg = rarg;
+ setop->all = all;
+
+ larg = setop;
+ }
+ /* merge larg to result */
+ cteedstmt->op = larg->op;
+ cteedstmt->larg = larg->larg;
+ cteedstmt->rarg = larg->rarg;
+ cteedstmt->all = larg->all;
+ }
+ else
+ {
+ /* no setop node is needed */
+ cteedstmt->targetList = copyObject(stmt->targetList);
+ cteedstmt->fromClause = list_make1(makeRangeVar(NULL, "**g**", -1));
+ }
+
+ ((SelectStmt *)cte->ctequery)->targetList = makeStarTargetList();
+ ((SelectStmt *)cte->ctequery)->groupClause = NULL;
+ ((SelectStmt *)cte->ctequery)->sortClause = NIL;
+ ((SelectStmt *)cte->ctequery)->limitOffset = stmt->limitOffset;
+ ((SelectStmt *)cte->ctequery)->limitCount = stmt->limitCount;
+ ((SelectStmt *)cte->ctequery)->lockingClause = stmt->lockingClause;
+
+ return cteedstmt;
+ }
+ else
+ /* strip the GroupByClause wrapper down to the plain field list */
+ stmt->groupClause = (Node *)((GroupByClause *)stmt->groupClause)->fields;
+ }
+
+ return stmt;
+ }
+
+ /*
* transformStmt -
* transform a Parse tree into a Query tree.
*/
***************
*** 176,181 ****
--- 334,341 ----
case T_SelectStmt:
{
SelectStmt *n = (SelectStmt *) parseTree;
+
+ n = transformGroupingSets(pstate, n);
if (n->valuesLists)
result = transformValuesClause(pstate, n);
***************
*** 826,835 ****
true /* fix unknowns */ );
qry->groupClause = transformGroupClause(pstate,
! stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
--- 986,995 ----
true /* fix unknowns */ );
qry->groupClause = transformGroupClause(pstate,
! (List *) stmt->groupClause,
&qry->targetList,
qry->sortClause,
! false);
if (stmt->distinctClause == NIL)
{
***************
*** 921,927 ****
Assert(stmt->targetList == NIL);
Assert(stmt->fromClause == NIL);
Assert(stmt->whereClause == NULL);
! Assert(stmt->groupClause == NIL);
Assert(stmt->havingClause == NULL);
Assert(stmt->windowClause == NIL);
Assert(stmt->op == SETOP_NONE);
--- 1081,1087 ----
Assert(stmt->targetList == NIL);
Assert(stmt->fromClause == NIL);
Assert(stmt->whereClause == NULL);
! Assert(stmt->groupClause == NULL);
Assert(stmt->havingClause == NULL);
Assert(stmt->windowClause == NIL);
Assert(stmt->op == SETOP_NONE);
*** ./src/backend/parser/gram.y.orig 2009-05-04 18:07:44.000000000 +0200
--- ./src/backend/parser/gram.y 2009-05-13 17:43:23.000000000 +0200
***************
*** 118,123 ****
--- 118,125 ----
static Node *makeNullAConst(int location);
static Node *makeAConst(Value *v, int location);
static Node *makeBoolAConst(bool state, int location);
+ static Node *makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr,
+ List *expr_list, int location);
static FuncCall *makeOverlaps(List *largs, List *rargs, int location);
static void check_qualified_name(List *names);
static List *check_func_name(List *names);
***************
*** 284,290 ****
target_list insert_column_list set_target_list
set_clause_list set_clause multiple_set_clause
ctext_expr_list ctext_row def_list indirection opt_indirection
! reloption_list group_clause TriggerFuncArgs select_limit
opt_select_limit opclass_item_list opclass_drop_list
opt_opfamily transaction_mode_list_or_empty
TableFuncElementList opt_type_modifiers
--- 286,292 ----
target_list insert_column_list set_target_list
set_clause_list set_clause multiple_set_clause
ctext_expr_list ctext_row def_list indirection opt_indirection
! reloption_list TriggerFuncArgs select_limit
opt_select_limit opclass_item_list opclass_drop_list
opt_opfamily transaction_mode_list_or_empty
TableFuncElementList opt_type_modifiers
***************
*** 416,421 ****
--- 418,427 ----
%type <str> opt_existing_window_name
%type <ival> opt_frame_clause frame_extent frame_bound
+ %type <node> grouping_element empty_grouping_set grouping_sets_spec
+ group_clause
+ %type <list> grouping_element_list
+ %type <boolean> opt_grouping_set_quantifier
/*
* If you make any token changes, update the keyword table in
***************
*** 438,444 ****
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
--- 444,450 ----
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
CREATEROLE CREATEUSER CROSS CSV CURRENT_P
! CUBE CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
***************
*** 451,457 ****
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P
HANDLER HAVING HEADER_P HOLD HOUR_P
--- 457,463 ----
FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
! GLOBAL GRANT GRANTED GREATEST GROUP_P GROUPING GROUPING_ID
HANDLER HAVING HEADER_P HOLD HOUR_P
***************
*** 485,494 ****
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
--- 491,500 ----
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
! RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
! SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SETS SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
***************
*** 7227,7234 ****
;
group_clause:
! GROUP_P BY expr_list { $$ = $3; }
! | /*EMPTY*/ { $$ = NIL; }
;
having_clause:
--- 7233,7286 ----
;
group_clause:
! GROUP_P BY opt_grouping_set_quantifier grouping_element_list
! {
! GroupByClause *clause = makeNode(GroupByClause);
! clause->all = $3;
! clause->fields = $4;
! clause->location = @1;
! $$ = (Node *) clause;
! }
! | /*EMPTY*/
! {
! $$ = NULL;
! }
! ;
!
! grouping_sets_spec:
! GROUPING SETS '(' grouping_element_list ')'
! {
! /* We cannot identify and drop empty sets yet. */
! GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
! gss->set_list = $4;
! gss->has_empty_set = false;
! gss->location = @1;
! $$ = (Node *) gss;
! }
! ;
!
! grouping_element_list:
! grouping_element { $$ = list_make1($1); }
! | grouping_element_list ',' grouping_element { $$ = lappend($1, $3); }
! ;
!
! grouping_element:
! a_expr { $$ = $1; }
! | grouping_sets_spec { $$ = $1; }
! | empty_grouping_set { $$ = $1; }
! ;
!
! empty_grouping_set:
! '(' ')'
! {
! $$ = makeNode(EmptySet);
! }
! ;
!
! opt_grouping_set_quantifier:
! ALL { $$ = true; }
! | DISTINCT { $$ = false; }
! | /*EMPTY*/ { $$ = true; }
;
having_clause:
***************
*** 9266,9271 ****
--- 9318,9346 ----
n->location = @1;
$$ = (Node *)n;
}
+ | GROUPING '(' a_expr ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING, $3, NIL, @1);
+ }
+ | GROUPING_ID '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_GROUPING_ID, NULL, $3, @1);
+ }
+ | CUBE '(' expr_list ')'
+ {
+ /*
+ * CUBE can appear in two different contexts: first, as part of a
+ * GROUPING SETS specification; second, as a normal UDF from the
+ * contrib cube module. Outside a grouping sets context the node
+ * is transformed to a FuncCall node later.
+ */
+ $$ = makeGroupingSetsFunc(FUNCTION_CUBE, NULL, $3, @1);
+ }
+ | ROLLUP '(' expr_list ')'
+ {
+ $$ = makeGroupingSetsFunc(FUNCTION_ROLLUP, NULL, $3, @1);
+ }
;
/*
***************
*** 10320,10325 ****
--- 10395,10401 ----
| SERVER
| SESSION
| SET
+ | SETS
| SHARE
| SHOW
| SIMPLE
***************
*** 10395,10400 ****
--- 10471,10477 ----
| EXTRACT
| FLOAT_P
| GREATEST
+ | GROUPING_ID
| INOUT
| INT_P
| INTEGER
***************
*** 10489,10494 ****
--- 10566,10572 ----
| COLUMN
| CONSTRAINT
| CREATE
+ | CUBE
| CURRENT_CATALOG
| CURRENT_DATE
| CURRENT_ROLE
***************
*** 10510,10515 ****
--- 10588,10594 ----
| FROM
| GRANT
| GROUP_P
+ | GROUPING
| HAVING
| IN_P
| INITIALLY
***************
*** 10533,10538 ****
--- 10612,10618 ----
| PRIMARY
| REFERENCES
| RETURNING
+ | ROLLUP
| SELECT
| SESSION_USER
| SOME
***************
*** 11047,11052 ****
--- 11127,11144 ----
return (Node *) x;
}
+ static Node *
+ makeGroupingSetsFunc(GroupingSetsFuncIdentity identity, Node *expr, List *expr_list, int location)
+ {
+ GroupingSetsFunc *gsfunc = makeNode(GroupingSetsFunc);
+
+ gsfunc->identity = identity;
+ gsfunc->expr = expr;
+ gsfunc->expr_list = expr_list;
+ gsfunc->location = location;
+ return (Node *) gsfunc;
+ }
+
/* parser_init()
* Initialize to parse one query string
*/
*** ./src/backend/parser/Makefile.orig 2009-05-04 21:23:31.000000000 +0200
--- ./src/backend/parser/Makefile 2009-05-04 21:24:06.000000000 +0200
***************
*** 14,20 ****
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o
FLEXFLAGS = -CF
--- 14,21 ----
OBJS= analyze.o gram.o keywords.o parser.o parse_agg.o parse_cte.o parse_clause.o \
parse_expr.o parse_func.o parse_node.o parse_oper.o parse_relation.o \
! parse_type.o parse_coerce.o parse_target.o parse_utilcmd.o scansup.o kwlookup.o \
! parse_gsets.o
FLEXFLAGS = -CF
*** ./src/backend/parser/parse_agg.c.orig 2009-05-11 21:12:59.000000000 +0200
--- ./src/backend/parser/parse_agg.c 2009-05-17 21:25:30.000000000 +0200
***************
*** 32,42 ****
--- 32,51 ----
int sublevels_up;
} check_ungrouped_columns_context;
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ } transform_ungrouped_target_context;
+
static void check_ungrouped_columns(Node *node, ParseState *pstate,
List *groupClauses, bool have_non_var_grouping);
static bool check_ungrouped_columns_walker(Node *node,
check_ungrouped_columns_context *context);
+ static Node * transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses);
+
/*
* transformAggregateCall -
***************
*** 318,324 ****
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! clause = (Node *) qry->targetList;
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
--- 327,344 ----
* WINDOW clauses. For that matter, it's also going to examine the
* grouping expressions themselves --- but they'll all pass the test ...
*/
! if (pstate->parentParseState && pstate->parentParseState->p_hasGroupingSets)
! {
! clause = (Node *) transform_ungrouped_target((Node *) qry->targetList,
! pstate,
! groupClauses);
+ /* HACK!!! -- move this to the transform phase */
! qry->targetList = clause;
! }
! else
! clause = (Node *) qry->targetList;
!
!
if (hasJoinRTEs)
clause = flatten_join_alias_vars(root, clause);
check_ungrouped_columns(clause, pstate,
***************
*** 426,431 ****
--- 446,527 ----
}
/*
+ * transform_ungrouped_target_mutator -
+ * All non-aggregate, non-constant columns are replaced by NULL;
+ * GROUPING and GROUPING_ID functions are replaced by constants.
+ */
+ static Node *
+ transform_ungrouped_target_mutator(Node *node,
+ transform_ungrouped_target_context *context)
+ {
+ if (node == NULL)
+ return NULL;
+ if (IsA(node, Aggref))
+ return node;
+
+ if (IsA(node, GroupingSetsFunc))
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ int result = 0;
+
+ if (gsf->identity == FUNCTION_GROUPING)
+ {
+ result = list_member(context->groupClause, gsf->expr) ? 0 : 1;
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else if (gsf->identity == FUNCTION_GROUPING_ID)
+ {
+ ListCell *el;
+
+ foreach(el, gsf->expr_list)
+ {
+ result = result << 1;
+ if (!list_member(context->groupClause, lfirst(el)))
+ result = result | 0x01;
+ }
+ return (Node *) make_const(context->pstate, makeInteger(result), gsf->location);
+ }
+ else /* replace Cube and Rollup by FuncCall node */
+ {
+ /* ToDo: Support cube function */
+ }
+ }
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ if (list_member(context->groupClause, node))
+ return node;
+ else
+ return (Node *) makeNullConst(var->vartype, var->vartypmod);
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ FuncExpr *fexpr = (FuncExpr *) node;
+
+ if (list_member(context->groupClause, node))
+ return node;
+ else
+ return (Node *) makeNullConst(fexpr->funcresulttype, -1);
+ }
+
+ return expression_tree_mutator(node,
+ transform_ungrouped_target_mutator, context);
+ }
+
+ static Node *
+ transform_ungrouped_target(Node *node, ParseState *pstate,
+ List *groupClauses)
+ {
+ transform_ungrouped_target_context context;
+
+ context.pstate = pstate;
+ context.groupClause = groupClauses;
+
+ return transform_ungrouped_target_mutator(node, &context);
+ }
+ /*
* check_ungrouped_columns -
* Scan the given expression tree for ungrouped variables (variables
* that are not listed in the groupClauses list and are not within
*** ./src/backend/parser/parse_expr.c.orig 2009-05-12 22:33:29.000000000 +0200
--- ./src/backend/parser/parse_expr.c 2009-05-13 18:41:54.000000000 +0200
***************
*** 273,278 ****
--- 273,296 ----
case T_CurrentOfExpr:
result = transformCurrentOfExpr(pstate, (CurrentOfExpr *) expr);
break;
+
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) expr;
+ ListCell *lc;
+ List *expr_list = NIL;
+
+ gsf->expr = (Node *) transformExpr(pstate, (Node *) gsf->expr);
+
+ foreach(lc, gsf->expr_list)
+ {
+ expr_list = lappend(expr_list, transformExpr(pstate,
+ (Node *) lfirst(lc)));
+ }
+ gsf->expr_list = expr_list;
+ result = expr;
+ break;
+ }
/*********************************************
* Quietly accept node types that may be presented when we are
***************
*** 304,309 ****
--- 322,328 ----
case T_CoerceToDomain:
case T_CoerceToDomainValue:
case T_SetToDefault:
+ case T_EmptySet:
{
result = (Node *) expr;
break;
*** ./src/backend/parser/parse_gsets.c.orig 2009-05-04 21:13:54.000000000 +0200
--- ./src/backend/parser/parse_gsets.c 2009-05-13 18:51:59.000000000 +0200
***************
*** 0 ****
--- 1,677 ----
+ /*-------------------------------------------------------------------------
+ *
+ * parse_gsets.c
+ * parser grouping sets transformations
+ *
+ * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * $PostgreSQL: $
+ *
+ *-------------------------------------------------------------------------
+ */
+ #include "postgres.h"
+ #include "catalog/namespace.h"
+ #include "nodes/nodes.h"
+ #include "nodes/makefuncs.h"
+ #include "nodes/nodeFuncs.h"
+ #include "nodes/value.h"
+ #include "nodes/pg_list.h"
+ #include "nodes/parsenodes.h"
+ #include "optimizer/tlist.h"
+ #include "parser/parse_clause.h"
+ #include "parser/parse_gsets.h"
+ #include "parser/parse_node.h"
+ #include "rewrite/rewriteManip.h"
+
+ #include <string.h>
+
+
+ #define UNDEF_LOCATION -1
+
+
+ #define ORDER_CLAUSE 0
+ #define GROUP_CLAUSE 1
+ #define DISTINCT_ON_CLAUSE 2
+
+ #define MAX_CUBE_FIELDS 8
+
+ #define IsGroupingSetsSpec(n) (IsA(n, GroupingSetsSpec))
+ #define IsUndefLoc(n) (n == UNDEF_LOCATION)
+ #define IsNIL(n) (n == NIL)
+
+ typedef struct
+ {
+ ParseState *pstate;
+ List *groupClause;
+ bool isGlobalTL;
+ } transform_ungroup_cols_context;
+
+ static List *add_distinct_set(List *gsets_list, List *gsets);
+ static List *add_distinct_sets(List *cum_gsets_list, List *gsets_list);
+ static Node *set_multiplication(List *list, bool has_empty_set);
+ static Node *expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset);
+
+
+ /*
+ * Return true when the grouping set is empty.
+ */
+ static bool
+ IsEmptyGroupingSet(Node *node)
+ {
+ if (node == NULL)
+ return true;
+ if (IsA(node, EmptySet))
+ return true;
+ if (IsA(node, GroupingSetsSpec))
+ return IsNIL(((GroupingSetsSpec *) node)->set_list);
+
+ return false;
+ }
+
+ /*
+ * expandGroupingSets does:
+ * a) search and replace CUBE and ROLLUP FuncCall nodes with
+ * GroupingSetsSpec nodes,
+ * b) multiply all sets in the GROUP BY list,
+ * c) drop all empty-set nodes.
+ *
+ */
+ Node *
+ expandGroupingSets(ParseState *pstate, List *grouplist)
+ {
+ ListCell *g;
+ List *result = NIL;
+ bool has_empty_set = false;
+ int location = UNDEF_LOCATION;
+ bool is_grouping_sets = false;
+
+ /* leave fast when grouplist is NIL */
+ if (IsNIL(grouplist))
+ return NULL;
+
+ /* detect single group by grouping set */
+ foreach(g, grouplist)
+ {
+ Node *el = (Node *) lfirst(g);
+ Node *expel;
+ bool isemptyset;
+
+ /*
+ * Is it an empty set? NULL is a shortcut for the empty set.
+ * The disadvantage is that we lose the location, but it is short.
+ */
+ if (IsEmptyGroupingSet(el))
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ /* take location from first item of list */
+ if (IsUndefLoc(location))
+ location = exprLocation(el);
+
+ expel = expandGroupingSetsOperator(pstate, el, &isemptyset);
+
+ if (isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ if (IsA(el, GroupingSetsSpec))
+ is_grouping_sets = true;
+ else
+ {
+ switch (((GroupingSetsFunc *) el)->identity)
+ {
+ case FUNCTION_CUBE:
+ case FUNCTION_ROLLUP:
+ is_grouping_sets = true;
+ break;
+ default:
+ is_grouping_sets = false;
+ }
+ }
+
+ result = lappend(result, expel);
+ }
+
+ /*
+ * When no grouping sets were detected but an empty set is present,
+ * create an implicit grouping set.
+ */
+ if (!is_grouping_sets)
+ {
+ if (has_empty_set)
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = result;
+ gss->has_empty_set = true;
+ gss->location = location;
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) gss;
+ }
+ else
+ /* no grouping sets features are used, so return the original list */
+ return (Node *) grouplist;
+ }
+
+ /* multiply all sets in the GROUP BY list */
+ pstate->p_hasGroupingSets = true;
+
+ return (Node *) set_multiplication(result, has_empty_set);
+ }
+
+ /*
+ * This function has two purposes: first, extract the element lists
+ * from row expressions; second, remove all empty sets and all
+ * duplicate sets.
+ * ToDo: add a flag to RowExpr that carries info about explicit use
+ * of the ROW keyword.
+ */
+ static List **
+ adjustElements(List *elements, int *nfields, bool *has_empty_set)
+ {
+ ListCell *l;
+ List *felements = NIL; /* filtered elements, without empty sets */
+ int _nfields;
+ List **result;
+ int i = 0;
+
+ *has_empty_set = false;
+ foreach(l, elements)
+ {
+ if (IsEmptyGroupingSet((Node *) lfirst(l)))
+ {
+ *has_empty_set = true;
+ continue;
+ }
+
+ felements = list_append_unique(felements, lfirst(l));
+ }
+
+ if (IsNIL(felements))
+ return NULL;
+
+ _nfields = list_length(felements);
+ result = palloc(_nfields * sizeof(List *));
+
+ foreach(l, felements)
+ {
+ Node *el = (Node *) lfirst(l);
+
+ switch (nodeTag(el))
+ {
+ case T_RowExpr:
+ result[i++] = ((RowExpr *) el)->args;
+ break;
+ default:
+ result[i++] = list_make1(el);
+
+ }
+ }
+
+ *nfields = _nfields;
+ return result;
+ }
+
+ /*
+ * This function adds one grouping set to a list of sets. The result
+ * is a list of distinct sets, i.e. a list of distinct lists.
+ *
+ * Note: the "distinct" in the function name is not related to the
+ * ALL|DISTINCT quantifier of the GROUP BY clause.
+ */
+ static List *
+ add_distinct_set(List *gsets_list, List *gsets)
+ {
+
+ List *uniqset = NIL; /* unique gsets */
+ ListCell *l;
+ int act_len;
+ bool found = false;
+
+ /*
+ * ensure that all fields in gsets are unique before we compare
+ * them against the members of gsets_list.
+ */
+ foreach (l, gsets)
+ uniqset = list_append_unique(uniqset, lfirst(l));
+
+ /* identical lists have the same length */
+ act_len = list_length(uniqset);
+
+ foreach(l, gsets_list)
+ {
+ Node *node = (Node *) lfirst(l);
+
+ Assert(IsA(node, List));
+ if (list_length((List *) node) == act_len)
+ {
+ ListCell *lc;
+
+ found = true;
+ foreach(lc, (List *) node)
+ if (!list_member(uniqset, (Node *) lfirst(lc)))
+ {
+ found = false;
+ break;
+ }
+ if (found)
+ break;
+ }
+ }
+
+ if (!found)
+ return lappend(gsets_list, uniqset);
+ else
+ return gsets_list;
+ }
+
+ /*
+ * This is used in GROUPING SETS(a, ROLLUP(a,b))
+ */
+ static List *
+ add_distinct_sets(List *cum_gsets_list, List *gsets_list)
+ {
+ ListCell *l;
+
+ foreach(l, gsets_list)
+ {
+ Node *el = (Node *) lfirst(l);
+ Assert(IsA(el, List));
+ cum_gsets_list = add_distinct_set(cum_gsets_list, (List *) el);
+ }
+ return cum_gsets_list;
+ }
+
+ /*
+ * Evaluate CUBE and ROLLUP operators
+ * Transform from GroupingSetsFunc to GroupingSetsSpec.
+ *
+ * Note: This routine doesn't necessarily replace all GroupingSetsFunc
+ * nodes; it evaluates only the GROUPING SETS context. Such nodes can
+ * appear elsewhere, mainly CUBE as a FuncCall, so we have to use a
+ * walker later to check them (and possibly replace them).
+ */
+ static Node *
+ expandGroupingSetsOperator(ParseState *pstate, Node *node, bool *isemptyset)
+ {
+ bool has_empty_set = false;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ int location;
+
+ if (IsEmptyGroupingSet(node))
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ *isemptyset = false;
+
+ switch (nodeTag(node))
+ {
+ case T_GroupingSetsFunc:
+ {
+ GroupingSetsFunc *gsf = (GroupingSetsFunc *) node;
+ List **sets;
+ List *sets_union;
+ int nfields = list_length(gsf->expr_list);
+
+ if (gsf->identity == FUNCTION_CUBE && nfields > MAX_CUBE_FIELDS)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("more than eight fields in CUBE operator"),
+ parser_errposition(pstate, gsf->location)));
+
+ sets = adjustElements(gsf->expr_list, &nfields, &has_empty_set);
+ if (sets == NULL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ switch (gsf->identity)
+ {
+ case FUNCTION_CUBE:
+ {
+ /*
+ * Grouping Sets CUBE operator
+ * This operator expands the set {a,b} to the sets {{a,b},{a},{b},{}}
+ *
+ * All combinations are generated with bitmap shifts,
+ * e.g. for three fields:
+ * 111 7 {a,b,c}
+ * 110 6 {a,b}
+ * 101 5 {a,c}
+ * 100 4 {a}
+ * 011 3 {b,c}
+ * 010 2 {b}
+ * 001 1 {c}
+ * 000 0 empty set
+ */
+ unsigned int btab[] = {0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff};
+ unsigned int bitmap = btab[nfields - 1];
+ unsigned int masc = bitmap;
+ unsigned int selector = 1 << (nfields - 1);
+ int i;
+
+ sets_union = NIL;
+ while (bitmap > 0)
+ {
+ unsigned int processed_bitmap = bitmap--;
+
+ sets_union = NIL;
+ for (i = 0; processed_bitmap > 0; i++)
+ {
+ if (processed_bitmap & selector)
+ sets_union = list_union(sets_union, sets[i]);
+
+ processed_bitmap = (processed_bitmap << 1) & masc;
+ }
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ case FUNCTION_ROLLUP:
+ {
+ /*
+ * Grouping Sets ROLLUP operator
+ * This operator expands the set {a,b} to the sets {{a,b},{a},{}}
+ */
+ int i;
+ int j;
+
+ set_list = NIL;
+ for (i = 0; i < nfields; i++)
+ {
+ sets_union = NIL;
+ for (j = 0; j < nfields - i; j++)
+ sets_union = list_union(sets_union, (List *) sets[j]);
+
+ set_list = lappend(set_list, sets_union);
+ }
+ }
+ break;
+
+ default:
+ /*
+ * nothing to expand here; for FUNCTION_GROUPING and
+ * FUNCTION_GROUPING_ID return the original node
+ */
+ pfree(sets);
+
+ return node;
+ }
+
+ pfree(sets);
+ location = gsf->location;
+ has_empty_set = true;
+ }
+ break;
+ case T_GroupingSetsSpec:
+ {
+ /*
+ * Grouping Sets (xxxx)
+ * a) evaluate the CUBE and ROLLUP operators
+ * b) ensure distinct sets
+ * c) find any empty set inside
+ */
+ ListCell *gse;
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) node;
+
+ set_list = NIL;
+ location = gss->location;
+
+ foreach(gse, gss->set_list)
+ {
+ Node *grouping_element = (Node *) lfirst(gse);
+ Node *expanded_set = expandGroupingSetsOperator(pstate,
+ grouping_element, isemptyset);
+
+ if (*isemptyset)
+ {
+ has_empty_set = true;
+ continue;
+ }
+
+ switch (nodeTag(expanded_set))
+ {
+ case T_GroupingSetsSpec:
+ {
+ GroupingSetsSpec *egss = (GroupingSetsSpec *) expanded_set;
+
+ if (egss->has_empty_set)
+ has_empty_set = true;
+
+ if (egss->set_list != NIL)
+ set_list = add_distinct_sets(set_list, egss->set_list);
+ }
+ break;
+
+ case T_RowExpr:
+ {
+ /*
+ * the parser recognizes (c,b) in GROUPING SETS(a,b,(c,b)) as
+ * ROW(c,b), so it is necessary to transform it to a GroupingSetsSpec
+ */
+ ListCell *l;
+ RowExpr *rexpr = (RowExpr *) expanded_set;
+
+ foreach(l, rexpr->args)
+ if (IsEmptyGroupingSet((Node *)lfirst(l)))
+ {
+ has_empty_set = true;
+ break;
+ }
+
+ set_list = add_distinct_set(set_list, rexpr->args);
+ }
+ break;
+
+ default:
+ set_list = add_distinct_set(set_list, list_make1(expanded_set));
+ }
+ }
+ }
+ break;
+
+ default:
+ return node;
+ }
+
+ if (set_list == NIL)
+ {
+ *isemptyset = true;
+ return NULL;
+ }
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ result->has_empty_set = has_empty_set;
+ result->set_list = set_list;
+ result->location = location;
+ /* this grouping set isn't empty */
+ *isemptyset = false;
+
+ return (Node *) result;
+ }
+
+ /* multiply two nodes:
+ * GroupingSetsSpec x GroupingSetsSpec
+ * Node x Node
+ * Node x GroupingSetsSpec
+ * GroupingSetsSpec x Node
+ */
+ static Node *
+ multiple(Node *a, Node *b)
+ {
+ ListCell *l;
+ ListCell *lj;
+ List *set_list = NIL;
+ GroupingSetsSpec *result;
+ GroupingSetsSpec *gss_a;
+ GroupingSetsSpec *gss_b;
+
+ /* leave when nodes are equal */
+ if (equal(a, b))
+ return a;
+
+ if (!IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ GroupingSetsSpec *gss = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make2(a,b));
+ gss->has_empty_set = false;
+ gss->location = exprLocation(a);
+
+ return (Node *) gss;
+ }
+ if (IsGroupingSetsSpec(a) && !IsGroupingSetsSpec(b))
+ {
+ Node *aux = a;
+
+ a = b; b = aux;
+ }
+ if (!IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b))
+ {
+ gss_b = (GroupingSetsSpec *) b;
+
+ foreach(l, gss_b->set_list)
+ {
+ List *set = (List *) lfirst(l);
+
+ Assert(IsA(lfirst(l), List));
+
+ /* add item to all grouping sets */
+ set = list_append_unique(set, a);
+ set_list = add_distinct_set(set_list, set);
+ }
+
+ if (gss_b->has_empty_set)
+ {
+ set_list = add_distinct_set(set_list, (list_make1(a)));
+ gss_b->has_empty_set = false;
+ }
+
+ gss_b->set_list = set_list;
+
+ return (Node *) b;
+ }
+
+ /*
+ * ((A,B),(C)) * ((X,Y),()) = ((A,B,X,Y),(A,B),(C,X,Y),(C))
+ */
+ Assert(IsGroupingSetsSpec(a) && IsGroupingSetsSpec(b));
+
+ gss_a = (GroupingSetsSpec *) a;
+ gss_b = (GroupingSetsSpec *) b;
+
+ result = (GroupingSetsSpec *) makeNode(GroupingSetsSpec);
+
+ foreach(l, gss_a->set_list)
+ {
+ foreach(lj, gss_b->set_list)
+ set_list = add_distinct_set(set_list,
+ list_union((List *) lfirst(l),
+ (List *) lfirst(lj)));
+ if (gss_b->has_empty_set)
+ set_list = add_distinct_set(set_list,
+ (List *) lfirst(l));
+ }
+
+ if (gss_a->has_empty_set)
+ foreach(l, gss_b->set_list)
+ set_list = add_distinct_set(set_list, (List *) lfirst(l));
+
+ result->set_list = set_list;
+ result->has_empty_set = gss_a->has_empty_set && gss_b->has_empty_set;
+ result->location = gss_a->location;
+
+ return (Node *) result;
+ }
+
+ /*
+ * multiply the elements of a list -> (a1,a2,a3,a4) => (((a1 x a2) x a3) x a4)
+ */
+ static Node *
+ set_multiplication(List *list, bool has_empty_set)
+ {
+ int nfields = list_length(list);
+ ListCell *lc;
+ bool first_iter = true;
+ Node *stack = NULL;
+
+ foreach(lc, list)
+ {
+ Node *el = (Node *) lfirst(lc);
+
+ /* Propagate info about empty set to first item */
+ if (first_iter)
+ {
+ if (has_empty_set)
+ {
+ if (IsGroupingSetsSpec(el))
+ {
+ ((GroupingSetsSpec *) el)->has_empty_set = true;
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return el;
+
+ stack = el;
+ }
+ else
+ {
+ GroupingSetsSpec *gss = makeNode(GroupingSetsSpec);
+ gss->set_list = list_make1(list_make1(el));
+ gss->has_empty_set = true;
+ gss->location = exprLocation(el);
+
+ stack = (Node *) gss;
+
+ /* leave when list has only one field */
+ if (nfields == 1)
+ return stack;
+ }
+ }
+ else
+ stack = el;
+
+ first_iter = false;
+ /* go on to the next item */
+ continue;
+ }
+
+ Assert(stack != NULL);
+ stack = multiple(stack, el);
+ }
+
+ return stack;
+ }
+
+ /*
+ * This procedure prepares the global targetList, the local targetLists,
+ * and the local groupClauses.
+ */
+ List *
+ transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause)
+ {
+ List *globalTL = copyObject((Node *)*targetList);
+ List *globalGC = NIL;
+ List *gpcs = NIL;
+ List *tls = NIL;
+ ListCell *gsc;
+ GroupingSetsSpec *gsspec = (GroupingSetsSpec *) linitial(groupClause);
+
+ *groupClauses = gpcs;
+ *targetLists = tls;
+ *targetList = globalTL;
+
+ return globalGC;
+ }
+
*** ./src/backend/parser/parse_target.c.orig 2009-05-04 21:16:40.000000000 +0200
--- ./src/backend/parser/parse_target.c 2009-05-04 21:22:30.000000000 +0200
***************
*** 1446,1451 ****
--- 1446,1467 ----
case T_XmlSerialize:
*name = "xmlserialize";
return 2;
+ case T_GroupingSetsFunc:
+ switch (((GroupingSetsFunc *) node)->identity)
+ {
+ case FUNCTION_GROUPING:
+ *name = "grouping";
+ return 2;
+ case FUNCTION_GROUPING_ID:
+ *name = "grouping_id";
+ return 2;
+ case FUNCTION_CUBE: /* keep the compiler quiet */
+ case FUNCTION_ROLLUP:
+ /* nothing */
+ break;
+ }
+ break;
+
default:
break;
}
*** ./src/include/nodes/nodes.h.orig 2009-05-04 19:32:41.000000000 +0200
--- ./src/include/nodes/nodes.h 2009-05-13 11:33:34.000000000 +0200
***************
*** 376,381 ****
--- 376,385 ----
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_GroupingSetsFunc,
+ T_GroupingSetsSpec,
+ T_EmptySet,
+ T_GroupByClause,
/*
* TAGS FOR RANDOM OTHER STUFF
*** ./src/include/nodes/parsenodes.h.orig 2009-05-04 19:15:28.000000000 +0200
--- ./src/include/nodes/parsenodes.h 2009-05-13 17:46:08.000000000 +0200
***************
*** 374,379 ****
--- 374,406 ----
} SortBy;
/*
+ * GroupingSetsSpec - for GROUP BY GROUPING SETS clause
+ */
+ typedef struct GroupingSetsSpec
+ {
+ NodeTag type;
+ List *set_list;
+ bool has_empty_set; /* true when the grouping sets include the empty set */
+ int location;
+ } GroupingSetsSpec;
+
+ typedef struct EmptySet
+ {
+ NodeTag type;
+ } EmptySet;
+
+ /*
+ * GroupByClause for GROUP BY clause
+ */
+ typedef struct GroupByClause
+ {
+ NodeTag type;
+ bool all;
+ List *fields;
+ int location;
+ } GroupByClause;
+
+ /*
* WindowDef - raw representation of WINDOW and OVER clauses
*
* For entries in a WINDOW list, "name" is the window name being defined.
***************
*** 416,421 ****
--- 443,468 ----
FRAMEOPTION_END_CURRENT_ROW)
/*
+ * GroupingSetsFunc - parser node for the GROUPING, GROUPING_ID, CUBE and ROLLUP quasi-functions
+ */
+ typedef enum GroupingSetsFuncIdentity
+ {
+ FUNCTION_GROUPING,
+ FUNCTION_GROUPING_ID,
+ FUNCTION_CUBE,
+ FUNCTION_ROLLUP
+ } GroupingSetsFuncIdentity;
+
+ typedef struct GroupingSetsFunc
+ {
+ NodeTag type;
+ GroupingSetsFuncIdentity identity;
+ Node *expr;
+ List *expr_list;
+ int location;
+ } GroupingSetsFunc;
+
+ /*
* RangeSubselect - subquery appearing in a FROM clause
*/
typedef struct RangeSubselect
***************
*** 939,945 ****
List *targetList; /* the target list (of ResTarget) */
List *fromClause; /* the FROM clause */
Node *whereClause; /* WHERE qualification */
! List *groupClause; /* GROUP BY clauses */
Node *havingClause; /* HAVING conditional-expression */
List *windowClause; /* WINDOW window_name AS (...), ... */
WithClause *withClause; /* WITH clause */
--- 986,992 ----
List *targetList; /* the target list (of ResTarget) */
List *fromClause; /* the FROM clause */
Node *whereClause; /* WHERE qualification */
! Node *groupClause; /* GROUP BY clauses */
Node *havingClause; /* HAVING conditional-expression */
List *windowClause; /* WINDOW window_name AS (...), ... */
WithClause *withClause; /* WITH clause */
*** ./src/include/parser/kwlist.h.orig 2009-05-04 18:08:32.000000000 +0200
--- ./src/include/parser/kwlist.h 2009-05-05 14:44:39.000000000 +0200
***************
*** 98,103 ****
--- 98,104 ----
PG_KEYWORD("createuser", CREATEUSER, UNRESERVED_KEYWORD)
PG_KEYWORD("cross", CROSS, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("csv", CSV, UNRESERVED_KEYWORD)
+ PG_KEYWORD("cube", CUBE, RESERVED_KEYWORD)
PG_KEYWORD("current", CURRENT_P, UNRESERVED_KEYWORD)
PG_KEYWORD("current_catalog", CURRENT_CATALOG, RESERVED_KEYWORD)
PG_KEYWORD("current_date", CURRENT_DATE, RESERVED_KEYWORD)
***************
*** 168,173 ****
--- 169,176 ----
PG_KEYWORD("granted", GRANTED, UNRESERVED_KEYWORD)
PG_KEYWORD("greatest", GREATEST, COL_NAME_KEYWORD)
PG_KEYWORD("group", GROUP_P, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping", GROUPING, RESERVED_KEYWORD)
+ PG_KEYWORD("grouping_id", GROUPING_ID, COL_NAME_KEYWORD)
PG_KEYWORD("handler", HANDLER, UNRESERVED_KEYWORD)
PG_KEYWORD("having", HAVING, RESERVED_KEYWORD)
PG_KEYWORD("header", HEADER_P, UNRESERVED_KEYWORD)
***************
*** 317,322 ****
--- 320,326 ----
PG_KEYWORD("right", RIGHT, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("role", ROLE, UNRESERVED_KEYWORD)
PG_KEYWORD("rollback", ROLLBACK, UNRESERVED_KEYWORD)
+ PG_KEYWORD("rollup", ROLLUP, RESERVED_KEYWORD)
PG_KEYWORD("row", ROW, COL_NAME_KEYWORD)
PG_KEYWORD("rows", ROWS, UNRESERVED_KEYWORD)
PG_KEYWORD("rule", RULE, UNRESERVED_KEYWORD)
***************
*** 334,339 ****
--- 338,344 ----
PG_KEYWORD("session_user", SESSION_USER, RESERVED_KEYWORD)
PG_KEYWORD("set", SET, UNRESERVED_KEYWORD)
PG_KEYWORD("setof", SETOF, COL_NAME_KEYWORD)
+ PG_KEYWORD("sets", SETS, UNRESERVED_KEYWORD)
PG_KEYWORD("share", SHARE, UNRESERVED_KEYWORD)
PG_KEYWORD("show", SHOW, UNRESERVED_KEYWORD)
PG_KEYWORD("similar", SIMILAR, TYPE_FUNC_NAME_KEYWORD)
*** ./src/include/parser/parse_gsets.h.orig 2009-05-04 21:41:52.000000000 +0200
--- ./src/include/parser/parse_gsets.h 2009-05-11 11:22:57.000000000 +0200
***************
*** 0 ****
--- 1,14 ----
+ #ifndef PARSE_GSETS_H
+ #define PARSE_GSETS_H
+
+ #include "parser/parse_node.h"
+ #include "nodes/pg_list.h"
+
+
+ Node *expandGroupingSets(ParseState *pstate, List *grouplist);
+ List *transformGroupingSetsSpec(ParseState *pstate, List *groupClause,
+ List **groupClauses, List **targetLists,
+ List **targetList, List *sortClause);
+
+
+ #endif /* PARSE_GSETS_H */
*** ./src/include/parser/parse_node.h.orig 2009-05-11 20:31:38.000000000 +0200
--- ./src/include/parser/parse_node.h 2009-05-11 20:32:05.000000000 +0200
***************
*** 101,106 ****
--- 101,107 ----
bool p_is_insert;
bool p_is_update;
Relation p_target_relation;
+ bool p_hasGroupingSets;
RangeTblEntry *p_target_rangetblentry;
} ParseState;
Олег Царев wrote:
Hello all!
If no one objects (in other words, everyone agrees), I will continue work
on the patch. In particular, I want to support a second strategy (a
tuplestore instead of a hash table) to preserve the order of the source (a
cheaper solution in the case of grouping sets + ORDER BY), to investigate
and brainstorm further optimisations, and to write regression tests and
technical documentation.
But I need some time to complete my investigation of PostgreSQL internals,
particularly CTEs.
Where are we on this patch? Is it moving forward?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2009/8/8 Alvaro Herrera <alvherre@commandprompt.com>:
Where are we on this patch? Is it moving forward?
It seems to me that the patch goes backward.
I looked through the gsets-0.6.diff for about an hour, and found it is
now only syntactic sugar that builds multiple GROUP BY queries on top of
CTE functionality. There's no executor modification.
If I remember correctly, the original patch touched executor parts.
I'd buy GROUPING SETS if it touches the executor, but not if it is only
syntactic sugar, because then you can write the same thing yourself
without the GROUPING SETS syntax. The motivation for pushing this
forward is, I guess, performance that cannot be achieved by rewriting
the query.
Because the GROUP BY we have today is, by definition, a subset of
GROUPING SETS, I suppose we'll refactor nodeAgg.c so that it can take
multiple group definitions. And we must support both HashAgg and
GroupAgg. For HashAgg it is easier in any case, as the earlier patch
shows. For GroupAgg it is a bit more complicated, since we sort by
different key sets.
When we want GROUPING SETS (a, b), at first we sort by a and aggregate,
then sort by b and aggregate. This is the same as:
select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b
so nothing is better than query rewriting unless we invent something new.
But in the case of subtotals and a grand total, as in a ROLLUP query,
GroupAgg can do it in a one-time scan by keeping multiple PerGroup
states with different life cycles.
Anyway, before going ahead we need a rough sketch of how to implement
this feature. Is syntactic sugar alone acceptable, or is internal
executor support necessary?
Regards,
--
Hitoshi Harada
2009/8/13 Hitoshi Harada <umi.tanuki@gmail.com>:
All right, except for this:
Because GROUP BY we have today is a subset of GROUPING SETS by
definition, I suppose we'll refactor nodeAgg.c so that it is allowed
to take multiple group definitions. And we must support both of
HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
earlier patch does. For GroupAgg, it is a bit complicated since we
sort by different key sets.
because GROUP BY is an optimized version of GROUPING SETS.
Of course we can extend the current implementation of GROUP BY, but then
we regress its performance.
Some questions for you: how do we calculate ROLLUP aggregation in a
single pass?
The stupid way: keep a separate aggregation buffer for every group and
accumulate every record into every group's calculator. When a group
changes, return that group's key to the output set, with NULL for the
fields the group does not contain, and restart that group's aggregation
buffer.
A better way: add a "merge aggregations" operation and keep one buffer
per group; when a group changes, merge this "main buffer" into the
others and return an intermediate result.
I think supporting this kind of grouping operation isn't simple, and a
separate implementation of ROLLUP is better.
Regards, Tsarev Oleg
2009/8/13 Hitoshi Harada <umi.tanuki@gmail.com>:
It seems to me that the patch goes backward.
A little bit.
I looked through the gsets-0.6.diff for about an hour, and found it is
now only syntactic sugar that builds multiple GROUP BY queries on top of
CTE functionality. There's no executor modification.
I wrote the older version at a time when CTE wasn't implemented. The old
patch had its own executor node and was based on creating a hash table
per group. That patch was maybe a little faster than 0.6, but it had a
lot of bugs. Whoever knows the planner code for grouping will have to
agree with me: this code isn't readable, and I didn't want to make it
more complicated and less readable. So I had the idea of joining
grouping sets with CTE. Grouping sets are a subset of CTE, so it is
possible. Grouping sets are generally non-recursive CTEs, and (I
believe) the two should be optimized together.
I preferred using CTE because it was the shortest path to a prototype
with few bugs and full functionality.
If I remember correctly, the original patch touched executor parts.
I'd buy GROUPING SETS if it touches the executor, but not if it is only
syntactic sugar, because then you can write the same thing yourself
without the GROUPING SETS syntax. The motivation for pushing this
forward is, I guess, performance that cannot be achieved by rewriting
the query.
I don't think you can do a simple transformation from the grouping sets
syntax to CTE. And what's more: why would you optimize grouping sets but
not optimize non-recursive CTE?
Because the GROUP BY we have today is, by definition, a subset of
GROUPING SETS, I suppose we'll refactor nodeAgg.c so that it can take
multiple group definitions. And we must support both HashAgg and
GroupAgg. For HashAgg it is easier in any case, as the earlier patch
shows. For GroupAgg it is a bit more complicated, since we sort by
different key sets.
This way is possible too, but it needs an absolute reworking of the
grouping planner and executor. Maybe it is time to do it, but it is
work for somebody other than me; my place is stored procedures.
When we want GROUPING SETS (a, b), at first we sort by a and aggregate,
then sort by b and aggregate. This is the same as:
select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b
so nothing is better than query rewriting unless we invent something new.
the problem is when x is a subquery. Then using CTE is better, because
we don't need to evaluate x twice. In the most typical use case, x
isn't a table.
But in the case of subtotals and a grand total, as in a ROLLUP query,
GroupAgg can do it in a one-time scan by keeping multiple PerGroup
states with different life cycles.
Anyway, before going ahead we need a rough sketch of how to implement
this feature. Is syntactic sugar alone acceptable, or is internal
executor support necessary?
I think both ways are possible. Probably the cleanest way is a total
refactoring of the grouping executor and grouping planner. I am not
sure we need new nodes; they all exist already, but these nodes cannot
work in parallel now.
2009/8/14 Олег Царев <zabivator@gmail.com>:
Some questions for you: how do we calculate ROLLUP aggregation in a
single pass?
I'd imagine something like:
select a, b, count(*) from x group by rollup(a, b);
PerGroup all = init_agg(), a = init_agg(), ab = init_agg();
while(row = fetch()){
if(group_is_changed(ab, row)){
result_ab = finalize_agg(ab);
ab = init_agg();
}
if(group_is_changed(a, row)){
result_a = finalize_agg(a);
a = init_agg();
}
advance_agg(all, row);
advance_agg(a, row);
advance_agg(ab, row);
}
result_all = finalize_agg(all);
Of course you should take care over the best way to return a result row
and continue the aggregates, and the number of grouping keys varies from
1 to many, but it is quite possible. And since normal GROUP BY is just
the case of key = a only, there won't be a performance regression.
A better way: add a "merge aggregations" operation and keep one buffer
per group; when a group changes, merge this "main buffer" into the
others and return an intermediate result.
"Merge aggregates" sounds fascinating to me, not only for this feature
but also for partitioned-table aggregates. But adding another function
(a "merge function"?) to the current aggregate system is quite a long
way off.
I think supporting this kind of grouping operation isn't simple, and a
separate implementation of ROLLUP is better.
Surely not simple. Adding another node is one of the choices, but from
a code-maintenance point of view I feel it is better to integrate it
into nodeAgg. nodeWindowAgg and nodeAgg have similar aggregate
processes but don't share them, so a bug fix in nodeAgg isn't complete
in itself; we must re-check nodeWindowAgg as well. Adding another
agg-like node *may* be a kind of nightmare.
Regards,
--
Hitoshi Harada
2009/8/13 Олег Царев <zabivator@gmail.com>:
A better way: add a "merge aggregations" operation and keep one buffer
per group; when a group changes, merge this "main buffer" into the
others and return an intermediate result.
I don't think this is possible for all operations. Don't forget:
people can implement their own aggregates. Example: weighted average.
regards
Pavel Stehule
2009/8/14 Pavel Stehule <pavel.stehule@gmail.com>:
I preferred using CTE because it was the shortest path to a prototype
with few bugs and full functionality.
You could make it by query rewriting, but as you say, the cleanest way
is a total refactoring of the existing nodeAgg. How easy it is to
implement is not a convincing argument.
the problem is when x is a subquery. Then using CTE is better, because
we don't need to evaluate x twice. In the most typical use case, x
isn't a table.
So we need a single-scan aggregate as far as possible. Buffering the
subquery's result is possible without a CTE node. Tuplestore has that
functionality, but I found the buffered result would be sorted multiple
times; one way might be to allow tuplesort to perform the sort multiple
times with different keys.
Regards,
--
Hitoshi Harada
2009/8/13 Hitoshi Harada <umi.tanuki@gmail.com>:
You could make it by query rewriting, but as you say, the cleanest way
is a total refactoring of the existing nodeAgg. How easy it is to
implement is not a convincing argument.
I agree. I simply don't have the time and energy to do it. I want to
concentrate on finishing some plpgsql issues, and then I have to do
some things other than PostgreSQL. There is a fully functional
prototype, and everybody is welcome to continue this work.
So we need a single-scan aggregate as far as possible. Buffering the
subquery's result is possible without a CTE node. Tuplestore has that
functionality, but I found the buffered result would be sorted multiple
times; one way might be to allow tuplesort to perform the sort multiple
times with different keys.
Yes, I'm not afraid of multiple evaluation of the aggregates; that's
cheap. The problem is multiple table scans. I have thought about a new
kind of aggregate. Current aggregates process row by row, with a final
operation. Some new kind of aggregate should work over a tuplestore:
internally it would get a pointer to the tuplestore and the number of
rows. This should be very fast for functions like median or array_agg.
I think it's similar to a window function, only it isn't one.
If you want to optimize for speed, the fastest solution will be using
hash tables; then you don't need a tuplestore. For ROLLUP, maybe a
single scan is possible, but I am not sure. There is an important
fact: the final function cannot modify the intermediate data.
Pavel
2009/8/14 Pavel Stehule <pavel.stehule@gmail.com>:
I agree. I simply don't have the time and energy to do it. I want to
concentrate on finishing some plpgsql issues, and then I have to do
some things other than PostgreSQL. There is a fully functional
prototype, and everybody is welcome to continue this work.
I see your situation. Actually your prototype is in good shape to be
discussed either way. But since you've been focusing on this feature,
it'd be better if you keep your eyes on it.
So, Oleg, will you continue with this?
Regards,
--
Hitoshi Harada
2009/8/14 Hitoshi Harada <umi.tanuki@gmail.com>:
Fun =) My implementation of ROLLUP in the DBMS QD works just as you
imagine there! =)
Also, we took multiple sorts of the source for the CUBE implementation,
but this is hard to maintain (sorting inside GROUP BY is bloat).
As a result we had a merged implementation of GROUP BY, ROLLUP and
window functions with some common code - it's one way of grouping the
source. A hash implementation groups into different hash tables (with
different keys); that is very expensive (it requires a lot of memory
for the keys).
I hope to continue my work after the crunch at my job ends =( (bad
TPC-H performance)
2009/8/14 Олег Царев <zabivator@gmail.com>:
I think you are too afraid about memory. Look at current Postgres: any
hash grouping is faster than sort grouping. Try and see. PostgreSQL
isn't an embedded database, so using less memory is not the main goal.
The goal is to have features with clean, readable and maintainable
source code.
After a week-long investigation, I am now sure my level of
qualification is not enough to implement "GROUPING SETS".
I would need documentation about the executor and the planner; I can't
understand how they work from the source code alone.
Lots of code, many cases, but very little information on "what this is"
and "how it works". Maybe I'm stupid.
Sorry.
2009/8/14 Pavel Stehule <pavel.stehule@gmail.com>:
Show quoted text
2009/8/14 Олег Царев <zabivator@gmail.com>:
2009/8/14 Hitoshi Harada <umi.tanuki@gmail.com>:
2009/8/14 Pavel Stehule <pavel.stehule@gmail.com>:
2009/8/13 Hitoshi Harada <umi.tanuki@gmail.com>:
2009/8/14 Pavel Stehule <pavel.stehule@gmail.com>:
I preferred using CTE, because this way was the shortest route to a
prototype with fewer bugs - and with full functionality.

You could make it by query rewriting, but as you say, the cleanest
way is a total refactoring of the existing nodeAgg. "Easy to implement"
by itself is not convincing.

I agree. I simply don't have the time and energy to do it. I want to
concentrate on finishing some plpgsql issues, and then I have to do
some other things than PostgreSQL. There is a fully functional
prototype, and everybody is welcome to continue this work.

I see your situation. Actually, your prototype is in good shape to be
discussed either way. But since you've been focusing on this feature,
it'd be better if you keep your eyes on it. So, Oleg, do you continue on this?

Regards,
--
Hitoshi Harada

I'd imagine something like:

select a, b, count(*) from x group by rollup(a, b);
On Thu, Sep 03, 2009 at 01:19:25AM +0400, Олег Царев wrote:
I doubt you're stupid; a stupid person wouldn't know what GROUPING SETS meant,
wouldn't bother finding out, and certainly wouldn't bother trying to implement
it. It's very helpful that you've let us know you're not working on it. That
way Pavel, if he finds he has time and interest, or someone else, can work on
it without fear of conflicting with what you're doing. Thanks for your work;
please don't get discouraged!
--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
2009/9/2 Олег Царев <zabivator@gmail.com>:
GROUPING SETS is too hard a place to begin: the grouping planner isn't
really readable code. That is the reason why I implemented GROUPING SETS
over CTE, and why I wanted to share code with CTE.
regards
Pavel
2009/9/3 Joshua Tolley <eggyknap@gmail.com>:
There are several ways to implement GROUPING SETS. Currently I don't
want to continue on this topic without some sponsoring. It took too much
of my time, and I am able to implement it only as a special variant of CTE.
The second way is a total refactoring of the grouping planner, which is
beyond me.
regards
Pavel
2009/9/3 Pavel Stehule <pavel.stehule@gmail.com>:
There is supposed to be another way to implement it: a "hybrid" of the
CTE approach and the hash-table approach. As you noticed, in group
(sort) mode the planner may be untouchable, which the CTE rewrite helps
to avoid, while in hash mode you've almost done it. I wouldn't like to
agree to a purely CTE-based approach, but the hybrid might be reasonable.
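Whichever strategy wins - the CTE/UNION ALL rewrite, per-set hash tables, or a hybrid - it has to reproduce the same result set. As a reference model only, here is a sketch (Python; grouping_sets_count and its parameters are hypothetical names) of GROUPING SETS semantics as a union of independent GROUP BYs, with NULLs (None) padding the columns a given set does not group on:

```python
def grouping_sets_count(rows, ncols, sets):
    """rows: tuples of ncols values; sets: list of column-index tuples.
    Yields one (col_0, ..., col_{ncols-1}, count) row per group per set."""
    out = []
    for s in sets:
        groups = {}
        for row in rows:                     # an independent GROUP BY per set
            key = tuple(row[i] for i in s)
            groups[key] = groups.get(key, 0) + 1
        for key, cnt in groups.items():
            padded = [None] * ncols          # NULL stands in for ungrouped columns
            for i, col in zip(s, key):
                padded[i] = col
            out.append(tuple(padded) + (cnt,))   # UNION ALL of the per-set results
    return out
```

The GROUPING() function exists precisely because this padding makes a generated NULL indistinguishable from a NULL that came from the table; a real implementation must carry that distinction alongside each row.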
Regards,
--
Hitoshi Harada