optimizing impossible matches

Started by Merlin Moncureabout 22 years ago3 messageshackers
Jump to latest
#1Merlin Moncure
merlin.moncure@rcsonline.com

Dear hackers:
Is it reasonable or possible to have the optimizer filter out impossible
string/numeric matches because of length/overflow conditions?

For example: (on 7.4.1)

select * from t where f = '1234567' -- f = char(6)
explain says index/seq scan, depending on presence of index, when
obviously there can be no matches.

also, for the typical case for indexed field f,
select * from t where f = f' or false
generates an index scan

but
select * from t where f = f' or f = f''
generates a seq. scan with default settings. If f'' is not in the
domain of f, the first case should apply. Is this really as simple as
it seems?

Merlin

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#1)
Re: optimizing impossible matches

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

Is this really as simple as it seems?

No.

The optimizer has no knowledge of specific operators except what it
finds in the system catalogs. It has no way in general to determine
that a comparison involving nonconstant values must always fail.
Even if we could do it, I am dubious that it would be worth expending
the cycles on every query to determine whether the situation holds.
AFAICS those would be wasted cycles on the huge majority of queries.

BTW, your claim here is simply false:

also, for the typical case for indexed field f,
select * from t where f = f' or false
generates an index scan
but
select * from t where f = f' or f = f''
generates a seq. scan with default settings.

Depending on the statistics of the column f, the optimizer might think
that a seqscan is better than a double indexscan, but it's not some kind
of hardwired behavior. I don't even think it's "typical" behavior.

regards, tom lane

#3Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#2)
Re: optimizing impossible matches

Tom Lane wrote:

The optimizer has no knowledge of specific operators except what it
finds in the system catalogs. It has no way in general to determine
that a comparison involving nonconstant values must always fail.
Even if we could do it, I am dubious that it would be worth expending
the cycles on every query to determine whether the situation holds.
AFAICS those would be wasted cycles on the huge majority of queries.

Ok, fair enough. And queries that did have this issue could be easily
rewritten...

Still, there is something that doesn't site quite right with me...my
problems is with SQL really, not Postgres. For example, the server
forbids 'abcd'::char(3) but allows 'abcd' > char(3) because the operator
is not bound to the specific type, but to the general type and ignores
type constraints. In other words, SQL implicitly allows comparison
between objects of different domains if the domains differ only by
constraint (including size).

Anyways, thanks for taking the time to answer.
Merlin