missing optimization - column <> column
Hi
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simple optimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;
It does full scan of foo, although it should be replaced by false in
planner time.
Same issue is a expression a = a .. can be replaced by true
Regards
Pavel
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simple optimizationcreate table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.
a <> a could go to NULL. Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.
Same issue is a expression a = a .. can be replaced by true
a = a can't be replaced unless you know that 'a' can't be NULL.
In short, fix the application.
Thanks!
Stephen
Pavel Stehule <pavel.stehule@gmail.com> writes:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simple optimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;
It does full scan of foo, although it should be replaced by false in
planner time.
Same issue is a expression a = a .. can be replaced by true
Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.
If you write something like "1 <> 1", it will be folded.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-05 16:24 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simpleoptimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.Same issue is a expression a = a .. can be replaced by true
Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.
understand
If you write something like "1 <> 1", it will be folded.
it works, but a <> a not
Regards
Pavel
Show quoted text
regards, tom lane
2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simpleoptimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.a <> a could go to NULL. Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.
it should be false everywhere
I don't defend a design of the application - it is exactly wrong. But
sometimes, it can be generated by some tool or it can be a human error. And
bad performance can be a big problems on systems, where you cannot to
deploy fix simply.
It is hard to say what should be good design - because these queries are
slow, I know so these queries are wrong, but these queries does significant
IO utilization - and I cannot to fix the application, because I am not a
author and the fix will not be available in next week.
Regards
Pavel
Show quoted text
Same issue is a expression a = a .. can be replaced by true
a = a can't be replaced unless you know that 'a' can't be NULL.
In short, fix the application.
Thanks!
Stephen
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simpleoptimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.a <> a could go to NULL. Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.it should be false everywhere
No, it's NULL, not false, if 'a' is NULL:
=# SELECT 1 WHERE (NULL <> NULL) IS NULL;
?column?
----------
1
(1 row)
=*# SELECT 1 WHERE (FALSE) IS NULL;
?column?
----------
(0 rows)
You can not make the assumption that 'a <> a' is always false.
Thanks!
Stephen
2016-12-05 16:41 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I found some crazy queries in one customer application. These
queries are
stupid, but it was surprise for me so there are not some simple
optimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.a <> a could go to NULL. Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.it should be false everywhere
No, it's NULL, not false, if 'a' is NULL:
=# SELECT 1 WHERE (NULL <> NULL) IS NULL;
?column?
----------
1
(1 row)=*# SELECT 1 WHERE (FALSE) IS NULL;
?column?
----------
(0 rows)You can not make the assumption that 'a <> a' is always false.
ok - when the expression is tested on NULL, then a <> a should not be
reduced. But when there are not this test, then I can replace this
expression by false.
Regards
Pavel
Show quoted text
Thanks!
Stephen
On Mon, Dec 5, 2016 at 3:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Pavel Stehule <pavel.stehule@gmail.com> writes:
I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simpleoptimization
create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.Same issue is a expression a = a .. can be replaced by true
Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.If you write something like "1 <> 1", it will be folded.
regards, tom lane
Would it be worth replacing the condition with the equivalent?
I mean would that help optimizing better some queries when it knows that a
is (not) nullable or when "a" is more complicated expression?
a <> a : (a IS NULL) AND NULL
a = a : (a IS NOT NULL) OR NULL
Pantelis Theodosiou
Would it be worth replacing the condition with the equivalent?
I mean would that help optimizing better some queries when it knows that a
is (not) nullable or when "a" is more complicated expression?a <> a : (a IS NULL) AND NULL
a = a : (a IS NOT NULL) OR NULL
I think you're looking for
a IS DISTINCT FROM a
And that will work for cases where a might be null.
I have no opinion about whether adding such a test to the planner is worth
it.
Actually there are lots of things that can be done with this sort of theorem proving.
And NULL is a plenty good answer for a filter, just not for a check constraint.
Amongst them INSERT through UNION ALL for symmetric views which can be handy for FDW partitioned tables.
One such implementation an be found here:
https://www.google.com/patents/US6728952 (apparently expired)
Cheers
Serge
Salesforce.com
Show quoted text
On Dec 5, 2016, at 7:28 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2016-12-05 16:24 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>>:
Pavel Stehule <pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> writes:I found some crazy queries in one customer application. These queries are
stupid, but it was surprise for me so there are not some simple optimizationcreate table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;It does full scan of foo, although it should be replaced by false in
planner time.Same issue is a expression a = a .. can be replaced by true
Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.understand
If you write something like "1 <> 1", it will be folded.
it works, but a <> a not
Regards
Pavel
regards, tom lane
On Mon, Dec 5, 2016 at 7:02 PM, Corey Huinker <corey.huinker@gmail.com>
wrote:
Would it be worth replacing the condition with the equivalent?
I mean would that help optimizing better some queries when it knows that
a is (not) nullable or when "a" is more complicated expression?a <> a : (a IS NULL) AND NULL
a = a : (a IS NOT NULL) OR NULLI think you're looking for
a IS DISTINCT FROM a
And that will work for cases where a might be null.
I have no opinion about whether adding such a test to the planner is worth
it.
No, (a IS DISTINCT FROM a) will evaluate to FALSE when a is NULL. The
other conditions (a <> a) , ((a IS NULL) AND NULL) will evaluate to NULL.