Re: Having Patch (against v6.3.2)

Started by Bruce Momjianalmost 28 years ago3 messageshackers
Jump to latest
#1Bruce Momjian
bruce@momjian.us

I have applied this fine patch from Stephan.

It fixes many problems with Having, and some other problems that exist.

Vadim, can you check on the psort_end() issue, and see where that should
go. I am lost in that area.

[Charset ISO-8859-1 unsupported, filtering to ASCII...]

Hi out there!

I think I have good news for you: Having is working now! The patch is
included in this email (tarred, gnu-zipped and finally uuencoded).

But before you go on unpacking the patch let me tell you about the
following items:

1) Queries using the having clause on base tables should work well
now. Here some tested features, (examples included in the patch):

1.1) Subselects in the having clause
1.2) Double nested subselects
1.3) Subselects used in the where clause and in the having clause
simultaneously
1.4) Union Selects using having
1.5) Indexes on the base relations are used correctly
1.6) Unallowed Queries are prevented (e.g. qualifications in the
having clause that belong to the where clause)
1.7) Insert into as select

2) Queries using the having clause on view relations also work
but there are some restrictions:

2.1) Create View as Select ... Having ...; using base tables in the select
does work *BUT*: only simple queries are allowed to this new
created view. This is *not* because of the having
logic but on the technique used to implement
views in postgreSQL, the query rewrite system.

2.1.1) The Query rewrite system:
As you know, postgreSQL uses a query rewrite system to
implement views. It does so by storing the query used to define
the view somewhere in the system catalogs. If a user makes a
query against the view the system "rewrites" the user query by
merging it with the stored query (used to define the view). The
new "rewritten" query is optimized, planned, etc and executed
against the base tables from the view definition query.

2.1.2) Why are only simple queries allowed against a view from 2.1) ?
The problem with the technique described in 2.1.1) is, that it
is unfortunately not possible to merge any two SQL queries in
in such a way that the result will behave as expected:
consider the following view definition:

create view testview as
select pid, sid
from part
where pid=5
group by pid;

and the following query:

select max(pid), sid
from testview
where sid = 100
group by sid;

The query rewrite system will produce something like:

select max(pid), sid
from part
where pid=5 AND sid = 100 /* no problem here */
group by ??? /* which attribute(s) to group by?? */

You see, if the view definition and the query both contain
a group clause, we will run into troubles.

The solution to this would be the implementation of subselects
in the from clause, then the rewrite system would produce:

select max(pid), sid
from (select pid, sid
from part
where pid=5
group by pid)
where sid = 100
group by sid;

2.2) Select ... from testview1, testview2, ... having...;
does also work, as long as the views used are simple
row/column subsets of the baserelations used. (No group clauses
in the view definitons)

3) Bug in ExecMergeJoin ??
This is something that has *NOTHING* to do with the Having logic!
Proof: Try the following query (without having my patch applied):

select s.sid, s.sname, s.city
from supplier s
where s.sid+10 in (select se1.pid
from supplier s1, sells se1, part p1
where s1.sid=se1.sid and
s1.sname=s.sname and
se1.pid=p1.pid);

(The file 'queries/create_insert.sql' included in the patch contains the
data for this, the query is included in 'queries/having.sql' !)

As you can see, there is neither a having qual nor an aggregate
function used in the above query an you will see, it will fail!

I found out that the reason for this is in the function
'ExecMergeJoin()' in
switch (mergestate->mj_JoinState)
{
....
case EXEC_MJ_NEXTOUTER:
....
CleanUpSort(node->join.lefttree->lefttree);
CleanUpSort(node->join.righttree->lefttree);
....
}

In 'CleanUpSort()' the function 'psort_end()' gets called and
closes down the sort, which is correct as long as no subselects
are in use!

I found out, that the bug does not appear when I comment the call
to 'psort_end()' out in 'CleanUpSort()'.

I heavily tested the system after that and things worked well but
maybe this is just a lucky chance.

So please, if anybody who has good knowledge of that part of the
code could have a look at that it would be great!

I am sure the sort has to be ended at some time calling 'psort_end()'
but I did not have enough time to look for the right place. I was
just happy about the fact it produced some correct results and
stopped working on that.

4) Test Examples included:
in the patch there is a directory 'queries' containing the
following files:
create_insert.sql to create the test relations and views
destroy.sql to drop the test relations and views
having.sql the test queries on base relations
view_having.sql the test queries on / defining views

5) The patch is against the original v6.3.2 and can be applied
by:
cd ..../postgresql-6.3.2/
patch [-p2] < having_6.3.2.diff

Regards Stefan

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#2Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: [HACKERS] Re: Having Patch (against v6.3.2)

3) Bug in ExecMergeJoin ??
This is something that has *NOTHING* to do with the Having logic!
Proof: Try the following query (without having my patch applied):

select s.sid, s.sname, s.city
from supplier s
where s.sid+10 in (select se1.pid
from supplier s1, sells se1, part p1
where s1.sid=se1.sid and
s1.sname=s.sname and
se1.pid=p1.pid);

(The file 'queries/create_insert.sql' included in the patch contains the
data for this, the query is included in 'queries/having.sql' !)

As you can see, there is neither a having qual nor an aggregate
function used in the above query an you will see, it will fail!

I found out that the reason for this is in the function
'ExecMergeJoin()' in
switch (mergestate->mj_JoinState)
{
....
case EXEC_MJ_NEXTOUTER:
....
CleanUpSort(node->join.lefttree->lefttree);
CleanUpSort(node->join.righttree->lefttree);
....
}

In 'CleanUpSort()' the function 'psort_end()' gets called and
closes down the sort, which is correct as long as no subselects
are in use!

I found out, that the bug does not appear when I comment the call
to 'psort_end()' out in 'CleanUpSort()'.

I heavily tested the system after that and things worked well but
maybe this is just a lucky chance.

So please, if anybody who has good knowledge of that part of the
code could have a look at that it would be great!

I am sure the sort has to be ended at some time calling 'psort_end()'
but I did not have enough time to look for the right place. I was
just happy about the fact it produced some correct results and
stopped working on that.

I have looked into this, and it appears you are correct. The
psort_end() gets called with the T_Sort node is closed. Why they are
trying to close it in the Merge makes no sense to me. The psort calling
code in the executor was cleaned up around 6.2 because there were some
strange blocks of code in this section. Perhaps this is another area
where the code was called when it should not have been.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#3Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: [HACKERS] Re: Having Patch (against v6.3.2)

3) Bug in ExecMergeJoin ??
This is something that has *NOTHING* to do with the Having logic!
Proof: Try the following query (without having my patch applied):

....
CleanUpSort(node->join.lefttree->lefttree);
CleanUpSort(node->join.righttree->lefttree);
....
}

In 'CleanUpSort()' the function 'psort_end()' gets called and
closes down the sort, which is correct as long as no subselects
are in use!

I looked at the Mariposa code, which has some fixes, and they have
removed the call to CleanUpSort in mergejoin, so that verifies the fix
is correct. I am removing the entire CleanUpSort function and calls from
mergejoin.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)