index not used for bigint without explicit cast

Started by Sam.Meshabout 3 years ago6 messagesbugs
Jump to latest
#1Sam.Mesh
Sam.Mesh@gmail.com

Hello,

"index not used for bigint without explicit cast" is very old limitation
mentioned in 2003 at
/messages/by-id/1054043810.93507.1.camel@jester

Here is more recent mention of similar limitation in 2021
https://www.gojek.io/blog/the-case-s-of-postgres-not-using-index
***
The problem arrises with the other case, where the values from the table
are to be converted to match the datatype of the query (i.e., we give
numeric in query and the column is of bigint type). As Postgres can only
cast a bigint to numeric, the only option it has is to convert every row in
the table to numeric and then compare. Thus in this case, the index won’t
be used.
***

Probably, somebody knows the current state of this limitation?

Thanks,
Sam

In reply to: Sam.Mesh (#1)
Re: index not used for bigint without explicit cast

On Tue, Jan 17, 2023 at 5:44 PM Sam.Mesh <Sam.Mesh@gmail.com> wrote:

The problem arrises with the other case, where the values from the table are to be converted to match the datatype of the query (i.e., we give numeric in query and the column is of bigint type). As Postgres can only cast a bigint to numeric, the only option it has is to convert every row in the table to numeric and then compare. Thus in this case, the index won’t be used.
***

Probably, somebody knows the current state of this limitation?

Not all numeric values can be converted to int8 without loss of
precision. If it was allowed, it would create subtle problems. The
same is not true for (say) int4 and int8, which can be mixed in the
way that you would expect.

Internally, int4 and int8 are part of the same btree operator family,
and so follow certain rules which are described here:

https://www.postgresql.org/docs/current/btree-behavior.html

--
Peter Geoghegan

#3Sam.Mesh
Sam.Mesh@gmail.com
In reply to: Peter Geoghegan (#2)
Re: index not used for bigint without explicit cast

Peter, thank you for clarification.
Could you please double check the following reasoning based on
https://www.postgresql.org/docs/current/btree-behavior.html?
- Index search by bigint column requires conversion of limiting
expressions to bigint type.
- Conversion from number(19,0) to bigint may cause overflow.
- So, index search is not possible.

Show quoted text

On Tue, Jan 17, 2023 at 5:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jan 17, 2023 at 5:44 PM Sam.Mesh <Sam.Mesh@gmail.com> wrote:

The problem arrises with the other case, where the values from the table are to be converted to match the datatype of the query (i.e., we give numeric in query and the column is of bigint type). As Postgres can only cast a bigint to numeric, the only option it has is to convert every row in the table to numeric and then compare. Thus in this case, the index won’t be used.
***

Probably, somebody knows the current state of this limitation?

Not all numeric values can be converted to int8 without loss of
precision. If it was allowed, it would create subtle problems. The
same is not true for (say) int4 and int8, which can be mixed in the
way that you would expect.

Internally, int4 and int8 are part of the same btree operator family,
and so follow certain rules which are described here:

https://www.postgresql.org/docs/current/btree-behavior.html

--
Peter Geoghegan

#4Noah Misch
noah@leadboat.com
In reply to: Sam.Mesh (#3)
Re: index not used for bigint without explicit cast

On Wed, Jan 18, 2023 at 01:43:43PM -0800, Sam.Mesh wrote:

Peter, thank you for clarification.
Could you please double check the following reasoning based on
https://www.postgresql.org/docs/current/btree-behavior.html?
- Index search by bigint column requires conversion of limiting
expressions to bigint type.
- Conversion from number(19,0) to bigint may cause overflow.
- So, index search is not possible.

Essentially, yes. This is a query planner limitation, not a fundamental
property of the search problem. "bigintcol = '5.1'::numeric" is equivalent to
"bigintcol <> bigintcol" or "CASE WHEN bigintcol IS NULL THEN NULL ELSE false
END". In contexts that don't distinguish "false" from NULL, it's equivalent
to constant "false". Those observations apply to any search value such that
'search_value'::numeric::bigint::numeric <> 'search_value'::numeric does not
return true, including overflow-error cases. Like many datatype-specific
tricks, the planner isn't aware at this time.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#4)
Re: index not used for bigint without explicit cast

Noah Misch <noah@leadboat.com> writes:

On Wed, Jan 18, 2023 at 01:43:43PM -0800, Sam.Mesh wrote:

Could you please double check the following reasoning based on
https://www.postgresql.org/docs/current/btree-behavior.html?
- Index search by bigint column requires conversion of limiting
expressions to bigint type.
- Conversion from number(19,0) to bigint may cause overflow.
- So, index search is not possible.

Essentially, yes. This is a query planner limitation, not a fundamental
property of the search problem. "bigintcol = '5.1'::numeric" is equivalent to
"bigintcol <> bigintcol" or "CASE WHEN bigintcol IS NULL THEN NULL ELSE false
END". In contexts that don't distinguish "false" from NULL, it's equivalent
to constant "false". Those observations apply to any search value such that
'search_value'::numeric::bigint::numeric <> 'search_value'::numeric does not
return true, including overflow-error cases. Like many datatype-specific
tricks, the planner isn't aware at this time.

Hmm ... I don't think the planner would be the place to try to change
this. The fundamental thing to do if you wanted to improve this case
would be to invent the "bigint = numeric" operator (and its whole
family, such as "numeric = bigint", "bigint < numeric", "smallint <
numeric", etc etc) and make those part of the integer_ops opfamily.
Which'd probably lead to merging integer_ops and numeric_ops into
a single opfamily.

As far as I can think at the moment, there are not any insurmountable
semantic obstacles to doing that. There are good reasons not to try
to merge integer and float opfamilies, namely that transitivity of
equality would fail because of inexact conversions; but I can't see
how that would occur between integer-family types and numeric, with
all of those being exact types.

Nonetheless, I'm not eager to try to do it. The practical hazards
are at least two: integer vs. numeric comparison operators will be
unpleasantly slow, and adding dozens more identically-named operators
will increase the risks of getting "ambiguous operator" errors in the
parser.

In the end I'd ask why is this a problem. If you converted your
tables from Oracle-ish number(19,0) to bigint, and did not convert
your application to use bigint instead of number(19,0), that sounds
like self-inflicted damage. Do both or neither.

regards, tom lane

#6Sam.Mesh
Sam.Mesh@gmail.com
In reply to: Tom Lane (#5)
Re: index not used for bigint without explicit cast

I've forgotten to say thanks. Thank you everybody for clarifications,
especially to Tom Lane.

Show quoted text

On Wed, Jan 18, 2023 at 9:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

On Wed, Jan 18, 2023 at 01:43:43PM -0800, Sam.Mesh wrote:

Could you please double check the following reasoning based on
https://www.postgresql.org/docs/current/btree-behavior.html?
- Index search by bigint column requires conversion of limiting
expressions to bigint type.
- Conversion from number(19,0) to bigint may cause overflow.
- So, index search is not possible.

Essentially, yes. This is a query planner limitation, not a fundamental
property of the search problem. "bigintcol = '5.1'::numeric" is equivalent to
"bigintcol <> bigintcol" or "CASE WHEN bigintcol IS NULL THEN NULL ELSE false
END". In contexts that don't distinguish "false" from NULL, it's equivalent
to constant "false". Those observations apply to any search value such that
'search_value'::numeric::bigint::numeric <> 'search_value'::numeric does not
return true, including overflow-error cases. Like many datatype-specific
tricks, the planner isn't aware at this time.

Hmm ... I don't think the planner would be the place to try to change
this. The fundamental thing to do if you wanted to improve this case
would be to invent the "bigint = numeric" operator (and its whole
family, such as "numeric = bigint", "bigint < numeric", "smallint <
numeric", etc etc) and make those part of the integer_ops opfamily.
Which'd probably lead to merging integer_ops and numeric_ops into
a single opfamily.

As far as I can think at the moment, there are not any insurmountable
semantic obstacles to doing that. There are good reasons not to try
to merge integer and float opfamilies, namely that transitivity of
equality would fail because of inexact conversions; but I can't see
how that would occur between integer-family types and numeric, with
all of those being exact types.

Nonetheless, I'm not eager to try to do it. The practical hazards
are at least two: integer vs. numeric comparison operators will be
unpleasantly slow, and adding dozens more identically-named operators
will increase the risks of getting "ambiguous operator" errors in the
parser.

In the end I'd ask why is this a problem. If you converted your
tables from Oracle-ish number(19,0) to bigint, and did not convert
your application to use bigint instead of number(19,0), that sounds
like self-inflicted damage. Do both or neither.

regards, tom lane