Definitive answer: can functions use indexes?

Started by Seamus Abshereover 10 years ago7 messagesgeneral
Jump to latest
#1Seamus Abshere
seamus@abshere.net

hi,

I've been using Postgres for years ( :heart: ) and I'm still in doubt
about this. Would somebody provide an authoritative, definitive,
narrative answer?

-> Can a function like `LEFT()` use an index?

(Or do I have to find an "equivalent" operator in order to leverage
indexes?)

Thanks!
Seamus

--
Seamus Abshere, SCEA
+598 99 54 99 54
https://github.com/seamusabshere

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Seamus Abshere (#1)
Re: Definitive answer: can functions use indexes?

On 1/6/16 5:15 PM, Seamus Abshere wrote:

I've been using Postgres for years ( :heart: ) and I'm still in doubt
about this. Would somebody provide an authoritative, definitive,
narrative answer?

-> Can a function like `LEFT()` use an index?

(Or do I have to find an "equivalent" operator in order to leverage
indexes?)

If you're looking for magic here, there is none.

CREATE INDEX ON a(field);
... WHERE field = LEFT(...) -- can use index
... WHERE LEFT(field) = ... -- can NOT use index

CREATE INDEX ON a(LEFT(field,5))
... WHERE field = LEFT(...) -- can NOT use index
... WHERE LEFT(field,5) = ... -- CAN use index
... WHERE LEFT(field,6) = ... -- can NOT use index
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Seamus Abshere (#1)
Re: Definitive answer: can functions use indexes?

Seamus Abshere <seamus@abshere.net> writes:

I've been using Postgres for years ( :heart: ) and I'm still in doubt
about this. Would somebody provide an authoritative, definitive,
narrative answer?

-> Can a function like `LEFT()` use an index?

To do what?

Since the question makes little sense as stated, I'm going to assume
you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar'
use an index on column foo?"

The answer to that is no, there is no such optimization built into
Postgres. (In principle there could be, but I've not heard enough
requests to make me think we'd ever pursue it.)

The equivalent optimization that *is* built in, and has been for
a long time, is for LIKE: "SELECT ... WHERE foo LIKE 'bar%'" can
use an index on foo, at least if it's an index sorted according to
C collation.

Another answer, which might serve as long as your application only
cares about a small number of prefix lengths, is functional indexes.
If you create a functional index on "left(foo,3)" you're all set.
This won't scale well to a whole bunch of different lengths, though.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Seamus Abshere
seamus@abshere.net
In reply to: Tom Lane (#3)
Re: Definitive answer: can functions use indexes?

On Wed, Jan 6, 2016, at 08:41 PM, Tom Lane wrote:

Seamus Abshere <seamus@abshere.net> writes:

-> Can a function like `LEFT()` use an index?

Since the question makes little sense as stated, I'm going to assume
you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar'
use an index on column foo?"

The answer to that is no, there is no such optimization built into
Postgres. (In principle there could be, but I've not heard enough
requests to make me think we'd ever pursue it.)

The equivalent optimization that *is* built in, and has been for
a long time, is for LIKE: "SELECT ... WHERE foo LIKE 'bar%'" can
use an index on foo, at least if it's an index sorted according to
C collation.

hi Tom,

I should have been more general. In layman's/narrative terms, what's the
deal with functions vs. operators for postgres indexes?

For example, `exist(hstore,text)` vs. `hstore ? text` ?

Thank you!
Seamus

PS. If I have understood correctly over the years, in order for the
query planner to use indexes, it needs to see operators - functions are
opaque to it. I'm looking for a bit more narrative on this to round out
my understanding.

--
Seamus Abshere, SCEA
+598 99 54 99 54
https://github.com/seamusabshere

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Seamus Abshere (#4)
Re: Definitive answer: can functions use indexes?

Seamus Abshere <seamus@abshere.net> writes:

I should have been more general. In layman's/narrative terms, what's the
deal with functions vs. operators for postgres indexes?

For example, `exist(hstore,text)` vs. `hstore ? text` ?

Yeah. exist(hstore,text) and hstore?text may yield the same result,
but only the latter is a candidate to be used with an index on an hstore
column. This is a consequence of decisions that were made twenty-five or
more years ago at Berkeley, to design the core system's interface to index
support in terms of operators and operator classes (there's a reason those
are not called "function classes"). At this point, those decisions are
so heavily embedded --- into not only the core code but perhaps hundreds
of third-party extensions --- that rethinking them would be very painful.
As long as the gain is only likely to be cosmetic, it probably won't
happen.

You can see some info about what I'm talking about here:
http://www.postgresql.org/docs/devel/static/xindex.html

A closely related issue is that most of the planner's optimization
intelligence is tied to operators, not functions, as shown here:
http://www.postgresql.org/docs/devel/static/xoper-optimization.html

Again, that's something that could be improved in principle, but
the amount of work involved seems disproportionate to the likely
benefit.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#3)
Re: Definitive answer: can functions use indexes?

On 1/6/16 5:41 PM, Tom Lane wrote:

Since the question makes little sense as stated, I'm going to assume
you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar'
use an index on column foo?"

The answer to that is no, there is no such optimization built into
Postgres. (In principle there could be, but I've not heard enough
requests to make me think we'd ever pursue it.)

BTW, the case where this would be highly valuable is timestamps. Being
able to do something like date_part('month',timestamptz)='Jan' would be
a big, big deal for warehousing.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Alban Hertroys
haramrae@gmail.com
In reply to: Jim Nasby (#6)
Re: Definitive answer: can functions use indexes?

On 07 Jan 2016, at 5:19, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:

On 1/6/16 5:41 PM, Tom Lane wrote:

Since the question makes little sense as stated, I'm going to assume
you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar'
use an index on column foo?"

The answer to that is no, there is no such optimization built into
Postgres. (In principle there could be, but I've not heard enough
requests to make me think we'd ever pursue it.)

BTW, the case where this would be highly valuable is timestamps. Being able to do something like date_part('month',timestamptz)='Jan' would be a big, big deal for warehousing.

Not just warehousing, for BI in general.

But, as is now quite clear, for many of those cases it should be fairly trivial to work around this limitation by creating either a functional index or an operator. For the above example, say something like timestamptz % 'month' = 'Jan'.

There are downsides to that approach though, such as readability and that this way of using % instead of date_part() is not according to any standard behaviour and could even behave differently or (more likely) not work at all on other PG instances.

That said, it's not uncommon in BI to require a seq. scan anyway, in which case the point is rather moot.

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general