order of clauses

Started by Patrick Welcheabout 25 years ago9 messagesgeneral
Jump to latest
#1Patrick Welche
prlw1@newn.cam.ac.uk

create table vals (
x float,
y float
);
insert into vals values (2,4);
insert into vals values (2,2);
insert into vals values (2,1);
insert into vals values (2,0);
select x/y from vals where y>0 and x/y>1;

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in
any order (A and B = B and A). I obviously would like (y>0) to happen first,
but I don't see how this can be achieved.. Any ideas?

Cheers,

Patrick

#2Michael Fork
mfork@toledolink.com
In reply to: Patrick Welche (#1)
Re: order of clauses

You didn't mention what version of Postgres, but in 7.1beta, you could do
the following (pretty sure on the syntax):

SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x
/ b.y) > 1;

In anything else, you could try a view:

CREATE VIEW valid_vals AS SELECT y FROM vals WHERE y > 0;
SELECT a.x/b.y FROM vals a, valid_vals b WHERE (a.x
/ b.y) > 1

Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio

On Wed, 14 Feb 2001, Patrick Welche wrote:

Show quoted text

create table vals (
x float,
y float
);
insert into vals values (2,4);
insert into vals values (2,2);
insert into vals values (2,1);
insert into vals values (2,0);
select x/y from vals where y>0 and x/y>1;

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in
any order (A and B = B and A). I obviously would like (y>0) to happen first,
but I don't see how this can be achieved.. Any ideas?

Cheers,

Patrick

#3Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Patrick Welche (#1)
Re: order of clauses

Well, it doesn't solve the ordering question, but you could
use a where something like this I guess:
where y>0 and (x/(case when y=0 then 1 else y end))>1

On Wed, 14 Feb 2001, Patrick Welche wrote:

Show quoted text

create table vals (
x float,
y float
);
insert into vals values (2,4);
insert into vals values (2,2);
insert into vals values (2,1);
insert into vals values (2,0);
select x/y from vals where y>0 and x/y>1;

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in
any order (A and B = B and A). I obviously would like (y>0) to happen first,
but I don't see how this can be achieved.. Any ideas?

#4Dan Wilson
phpPgAdmin@acucore.com
In reply to: Michael Fork (#2)
Re: order of clauses

: SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x
: / b.y) > 1;

How much of a performance hit is there when using a select in the FROM
clause? Is it even noticeable? How much better is it to create a static
view?

-Dan

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Patrick Welche (#1)
Re: order of clauses

Patrick Welche <prlw1@newn.cam.ac.uk> writes:

select x/y from vals where y>0 and x/y>1;

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in
any order (A and B = B and A). I obviously would like (y>0) to happen first,
but I don't see how this can be achieved.. Any ideas?

Of course you can rewrite this particular case to avoid the division,
but I suppose you are looking for a more general answer.
Consider something like

CASE WHEN y > 0 THEN x/y > 1 ELSE false END

I think that right now, the planner gratuitously reverses the order of
the WHERE clauses that it's unable to convert to index/join quals, thus
your failure. So you could hack around the problem just by switching
the two conditions. I've been meaning to try to figure out where the
reversal is happening and undo it, however, so this behavior should not
be considered to be documented/supported/guaranteed.

regards, tom lane

#6Steve Wolfe
steve@iboats.com
In reply to: Patrick Welche (#1)
Re: order of clauses

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be

evaluated in

any order (A and B = B and A). I obviously would like (y>0) to happen

first,

but I don't see how this can be achieved.. Any ideas?

I have one idea that would be nifty to implement. In some compilers, you
can turn off complete boolean checking. As soon as any part of an
expression will invalidate the expression, it stops evaluating all of it.
That can help you avoid division by zero, and keeps you from evaluating
parts of the expression that don't matter. It sounds like a good idea, at
least to an ignoramus like me. : )

steve

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Dan Wilson (#4)
Re: order of clauses

Dan Wilson writes:

: SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x
: / b.y) > 1;

How much of a performance hit is there when using a select in the FROM
clause? Is it even noticeable? How much better is it to create a static
view?

Subselects in FROM are currently not the most performance-encouraging way
to write a query (explicit JOINs might use more efficient plans), but
setting up a view is going to buy you zero because a query from a view is
just going to end up being processed like a subselect in FROM.

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#8Mitch Vincent
mitch@venux.net
In reply to: Patrick Welche (#1)
Re: order of clauses

Are you referring to short circuit? That's a language feature, isn't it? I
didn't think it had anything to do with the compiler (I know C and a few
other languages do it). Anyway, I could be wrong.. Seems that could break a
lot of code if the programmer relies on short circuit in some conditional
statements.

if ( whatever() OR something() ) {

blah();

}

-- if "whatever" evaluates to true, then "something" isn't executed (the
whole statement is true if one is true)...

This really only comes into play when you're comparing the values returned
by something (a method, function, etc), if you're just looking at boolean
variable I guess it doesn't matter.

-Mitch

----- Original Message -----
From: "Steve Wolfe" <steve@iboats.com>
To: <pgsql-general@postgresql.org>
Sent: Friday, February 16, 2001 3:10 PM
Subject: Re: order of clauses

will give a divide by zero error as A=(y>0) and B=(x/y>1) can be

evaluated in

any order (A and B = B and A). I obviously would like (y>0) to happen

first,

but I don't see how this can be achieved.. Any ideas?

I have one idea that would be nifty to implement. In some compilers,

you

Show quoted text

can turn off complete boolean checking. As soon as any part of an
expression will invalidate the expression, it stops evaluating all of it.
That can help you avoid division by zero, and keeps you from evaluating
parts of the expression that don't matter. It sounds like a good idea, at
least to an ignoramus like me. : )

steve

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#7)
Re: order of clauses

Peter Eisentraut <peter_e@gmx.net> writes:

Subselects in FROM are currently not the most performance-encouraging way
to write a query (explicit JOINs might use more efficient plans), but
setting up a view is going to buy you zero because a query from a view is
just going to end up being processed like a subselect in FROM.

Actually there's more to it than that. A view will indeed be expanded
into something that looks exactly like a subselect-in-FROM (think of the
view as a macro that gets expanded). But downstream of that, the
planner will try to "pull up" the subselect into the main query if it's
simple enough. If the pullup is successful, then there's no performance
penalty to having written a view rather than an explicit join. What's
more, because it's done that way, the same applies to explicitly written
subselect-in-FROM.

For example, consider this literal join:

regression=# explain select * from int8_tbl a, int8_tbl b where a.q1=b.q2;
NOTICE: QUERY PLAN:

Merge Join (cost=2.22..2.34 rows=5 width=32)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl b (cost=0.00..1.05 rows=5 width=16)

EXPLAIN

If you recast this with a subselect, it still gets the same plan because
the subselect is absorbed into the upper query:

regression=# explain select * from int8_tbl a,
regression-# (select * from int8_tbl) b where a.q1=b.q2;
NOTICE: QUERY PLAN:

Merge Join (cost=2.22..2.34 rows=5 width=32)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl (cost=0.00..1.05 rows=5 width=16)

EXPLAIN

However the planner is not currently bright enough to pull up a subquery
with, say, an ORDER BY:

regression=# explain select * from int8_tbl a,
regression-# (select * from int8_tbl order by q1) b where a.q1=b.q2;
NOTICE: QUERY PLAN:

Merge Join (cost=2.27..2.40 rows=5 width=32)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16)
-> Sort (cost=1.17..1.17 rows=5 width=16)
-> Subquery Scan b (cost=1.11..1.11 rows=5 width=16)
-> Sort (cost=1.11..1.11 rows=5 width=16)
-> Seq Scan on int8_tbl (cost=0.00..1.05 rows=5 width=16)

EXPLAIN
regression=#

In this case the outer join plan doesn't change, but it might if we were
considering something where a nestloop with inner indexscan would have
been the best plan. An indexscan has to be on a base relation, not on a
SubqueryScan node, so the planner would fail to discover the best plan
without the pullup.

The results would be the same if I'd done CREATE VIEWS rather than
writing explicit subselect-in-FROM.

regards, tom lane