possible optimizations - pushing filter before aggregation

Started by Pavel Stehuleabout 9 years ago6 messages
#1Pavel Stehule
pavel.stehule@gmail.com

Hi

In one application I see slow queries. There is often used views like

CREATE VIEW v AS SELECT min(a) M1, min(b) M2, max(c) M3, x, y, z
FROM t1 GROUP BY x, y, z;

and queries like

SELECT * FROM v
WHERE M2 = const1
AND M3 > const2

Isn't possible in this case push equivalence before aggregation?

Regards

Pavel

#2Douglas Doole
dougdoole@gmail.com
In reply to: Pavel Stehule (#1)
Re: possible optimizations - pushing filter before aggregation

On Fri, Nov 18, 2016 at 12:47 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Isn't possible in this case push equivalence before aggregation?

If I'm understanding you correctly, that would lead to wrong results.
Here's a simple example:

CREATE VIEW v AS SELECT MIN(a) m FROM t;

and table T contains:

T:A
---
1
2
3

SELECT * FROM v WHERE m = 2

The minimum value of A is 1, so the query should return no rows.

However, if we filter first we'd be effectively doing the query:

SELECT MIN(a) m FROM
(SELECT a FROM t WHERE a=2) AS v(a)

The subquery is going to return an intermediate result of:

V:A
---
2

And the minimum of that is 2, which is the wrong answer.

- Doug
Salesforce

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Douglas Doole (#2)
Re: possible optimizations - pushing filter before aggregation

2016-11-19 3:59 GMT+01:00 Douglas Doole <dougdoole@gmail.com>:

On Fri, Nov 18, 2016 at 12:47 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Isn't possible in this case push equivalence before aggregation?

If I'm understanding you correctly, that would lead to wrong results.
Here's a simple example:

CREATE VIEW v AS SELECT MIN(a) m FROM t;

and table T contains:

T:A
---
1
2
3

SELECT * FROM v WHERE m = 2

The minimum value of A is 1, so the query should return no rows.

However, if we filter first we'd be effectively doing the query:

SELECT MIN(a) m FROM
(SELECT a FROM t WHERE a=2) AS v(a)

The subquery is going to return an intermediate result of:

V:A
---
2

And the minimum of that is 2, which is the wrong answer.

yes, you have true,

thank you for correcting

Regards

Pavel

Show quoted text

- Doug
Salesforce

#4Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Pavel Stehule (#3)
Re: possible optimizations - pushing filter before aggregation

On Sat, Nov 19, 2016 at 8:59 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

SELECT MIN(a) m FROM
(SELECT a FROM t WHERE a=2) AS v(a)

The subquery is going to return an intermediate result of:

V:A
---
2

And the minimum of that is 2, which is the wrong answer.

yes, you have true,

In above case wondering if we could do this

Min(a) = 2 is the condition, generate condition *"a <= 2"* and push it down
as scan key. Since pushed down condition is lossy one for us ( it gives
values < 2), finally do a recheck of *"Min(a) = 2"*.

For Max(a) = 2 we can have *"a >=2"*,

If both are given we can combine them appropriately.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#5Douglas Doole
dougdoole@gmail.com
In reply to: Mithun Cy (#4)
Re: possible optimizations - pushing filter before aggregation

In above case wondering if we could do this

Min(a) = 2 is the condition, generate condition *"a <= 2"* and push it
down as scan key. Since pushed down condition is lossy one for us ( it
gives values < 2), finally do a recheck of *"Min(a) = 2"*.

For Max(a) = 2 we can have *"a >=2"*,

After replying, I was thinking along these lines too. I can't see any
reason why it wouldn't work. The same would apply for HAVING clauses on
min/max aggregations as well.

For min, you should be able to pre-filter =, < , and <=. In all cases the
pre-filter would be <=. For max it would be =, > , >= becoming >=.

- Doug Doole
Salesforce

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Douglas Doole (#5)
Re: possible optimizations - pushing filter before aggregation

Douglas Doole <dougdoole@gmail.com> writes:

For min, you should be able to pre-filter =, < , and <=. In all cases the
pre-filter would be <=. For max it would be =, > , >= becoming >=.

Doesn't really seem worth the trouble to me, given that those are pretty
unselective filter conditions. If you could push down an = then it might
be worth doing ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers