like performance w/o wildcards.

Started by Joseph Shraibmanover 22 years ago16 messagesgeneral
Jump to latest
#1Joseph Shraibman
jks@selectacast.net

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#1)
Re: like performance w/o wildcards.

Joseph Shraibman <jks@selectacast.net> writes:

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

On what do you base that conclusion?

regression=# create table t1 (f1 text unique);
NOTICE: CREATE TABLE / UNIQUE will create implicit index 't1_f1_key' for table 't1'
CREATE TABLE
regression=# explain select * from t1 where f1 like 'foo';
QUERY PLAN
---------------------------------------------------------------------
Index Scan using t1_f1_key on t1 (cost=0.00..4.82 rows=1 width=32)
Index Cond: (f1 = 'foo'::text)
Filter: (f1 ~~ 'foo'::text)
(3 rows)

regression=#

regards, tom lane

#3Joseph Shraibman
jks@selectacast.net
In reply to: Tom Lane (#2)
Re: like performance w/o wildcards.

Hmm. I didn't work for me. I'll try and figure this out.

Tom Lane wrote:

Show quoted text

Joseph Shraibman <jks@selectacast.net> writes:

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

On what do you base that conclusion?

regression=# create table t1 (f1 text unique);
NOTICE: CREATE TABLE / UNIQUE will create implicit index 't1_f1_key' for table 't1'
CREATE TABLE
regression=# explain select * from t1 where f1 like 'foo';
QUERY PLAN
---------------------------------------------------------------------
Index Scan using t1_f1_key on t1 (cost=0.00..4.82 rows=1 width=32)
Index Cond: (f1 = 'foo'::text)
Filter: (f1 ~~ 'foo'::text)
(3 rows)

regression=#

regards, tom lane

#4Richard Huxton
dev@archonet.com
In reply to: Joseph Shraibman (#3)
Re: like performance w/o wildcards.

On Monday 04 August 2003 04:29, Joseph Shraibman wrote:

Hmm. I didn't work for me. I'll try and figure this out.

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

Check your locale/encoding - you probably want "C" or similar for LIKE to use
an index.
--
Richard Huxton
Archonet Ltd

#5Joseph Shraibman
jks@selectacast.net
In reply to: Richard Huxton (#4)
Re: like performance w/o wildcards.

Richard Huxton wrote:

On Monday 04 August 2003 04:29, Joseph Shraibman wrote:

Hmm. I didn't work for me. I'll try and figure this out.

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

Check your locale/encoding - you probably want "C" or similar for LIKE to use
an index.

I'm using latin1. Why should it make a difference?

#6Bruce Momjian
bruce@momjian.us
In reply to: Joseph Shraibman (#1)
Re: like performance w/o wildcards.

Well, I don't see that = would be significantly faster than LIKE with a
no-wildcard string, so I don't see the value in having LIKE detect
non-whildcard strings.

---------------------------------------------------------------------------

Joseph Shraibman wrote:

I notice in 7.3.3 the planner can't tell when a LIKE has no wildcards
and is in reality an '='. Is this an easy change to make?

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Joseph Shraibman
jks@selectacast.net
In reply to: Bruce Momjian (#6)
Re: like performance w/o wildcards.

Bruce Momjian wrote:

Well, I don't see that = would be significantly faster than LIKE with a
no-wildcard string, so I don't see the value in having LIKE detect
non-whildcard strings.

Because it won't use an index for a LIKE, only a seqscan.

#8Bruce Momjian
bruce@momjian.us
In reply to: Joseph Shraibman (#7)
Re: like performance w/o wildcards.

Joseph Shraibman wrote:

Bruce Momjian wrote:

Well, I don't see that = would be significantly faster than LIKE with a
no-wildcard string, so I don't see the value in having LIKE detect
non-whildcard strings.

Because it won't use an index for a LIKE, only a seqscan.

Oh, that's interesting. I think a LIKE will already use an index,
except for non-C locales. Is that the issue?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#6)
Re: like performance w/o wildcards.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Well, I don't see that = would be significantly faster than LIKE with a
no-wildcard string, so I don't see the value in having LIKE detect
non-whildcard strings.

You forgot about indexing.

regards, tom lane

#10Joseph Shraibman
jks@selectacast.net
In reply to: Tom Lane (#9)
Re: like performance w/o wildcards.

Bruce Momjian wrote:

Joseph Shraibman wrote:

Bruce Momjian wrote:

Joseph Shraibman wrote:

Bruce Momjian wrote:

Oh, that's interesting. I think a LIKE will already use an index,
except for non-C locales. Is that the issue?

Yes, I'm using latin1. Why does an index only work on C?

Because we can't determine what caracters are before/after a given
character to do a restriction, e.g. col LIKE 'F*' add col >= 'F' AND col
< 'G'. In non-C, we don't know the ordering.

Then what is the index for?

The index is for non-LIKE comparisons, like = and >.

My point is what is the difference? If the index is done char by char then LIKE ought to
be able to use the index to find 'F*', no matter what the locale is. What can't you
figure out what is before/after a given char? Don't you have that information? Don't you
need it to create the index in the first place?

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#10)
Re: like performance w/o wildcards.

Joseph Shraibman <jks@selectacast.net> writes:

The index is for non-LIKE comparisons, like = and >.

My point is what is the difference? If the index is done char by char then LIKE ought to
be able to use the index to find 'F*', no matter what the locale is. What can't you
figure out what is before/after a given char? Don't you have that information? Don't you
need it to create the index in the first place?

<yawn> Read the archives. We would love to use locale-sorted indexes
for LIKE, but we *can't*. There are too many bizarre sorting rules.
(Hint: almost no locale does its sorting purely "char by char".)

regards, tom lane

#12Joseph Shraibman
jks@selectacast.net
In reply to: Tom Lane (#11)
Re: like performance w/o wildcards.

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

The index is for non-LIKE comparisons, like = and >.

My point is what is the difference? If the index is done char by char then LIKE ought to
be able to use the index to find 'F*', no matter what the locale is. What can't you
figure out what is before/after a given char? Don't you have that information? Don't you
need it to create the index in the first place?

<yawn> Read the archives. We would love to use locale-sorted indexes
for LIKE, but we *can't*. There are too many bizarre sorting rules.
(Hint: almost no locale does its sorting purely "char by char".)

regards, tom lane

What percentage of locales have this problem? Does latin1 have this problem?

And what about my original idea, can LIKE be turned into an = when there are no wildcards?

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#12)
Re: like performance w/o wildcards.

Joseph Shraibman <jks@selectacast.net> writes:

What percentage of locales have this problem? Does latin1 have this problem?

Latin1 is an encoding, not a locale. To a first approximation, I'd say
*all* non-C locales have some kind of sorting funny business.

And what about my original idea, can LIKE be turned into an = when there are no wildcards?

It does ... if the index-conversion optimization is enabled at all. Not
sure it's worth special-casing the '=' case otherwise.

regards, tom lane

#14Joseph Shraibman
jks@selectacast.net
In reply to: Tom Lane (#13)
Re: like performance w/o wildcards.

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

What percentage of locales have this problem? Does latin1 have this problem?

Latin1 is an encoding, not a locale. To a first approximation, I'd say
*all* non-C locales have some kind of sorting funny business.

OK this clears things up a bit. The locale on my production server (redhat) is set to
en_US, which explains why LIKE doesn't use an index. Do I just have to reset the locale
environment variable and restart postgres? What might the side effects of that be?

And what about my original idea, can LIKE be turned into an = when there are no wildcards?

It does ... if the index-conversion optimization is enabled at all.

Sorry, what is 'index-conversion optimization' and when is it enabled?

#15scott.marlowe
scott.marlowe@ihs.com
In reply to: Joseph Shraibman (#14)
Re: like performance w/o wildcards.

On Mon, 4 Aug 2003, Joseph Shraibman wrote:

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

What percentage of locales have this problem? Does latin1 have this problem?

Latin1 is an encoding, not a locale. To a first approximation, I'd say
*all* non-C locales have some kind of sorting funny business.

OK this clears things up a bit. The locale on my production server (redhat) is set to
en_US, which explains why LIKE doesn't use an index. Do I just have to reset the locale
environment variable and restart postgres? What might the side effects of that be?

Nope, changing locales involves dumping reinitting and restoring. Sorry.
:(

And what about my original idea, can LIKE be turned into an = when there are no wildcards?

It does ... if the index-conversion optimization is enabled at all.

Sorry, what is 'index-conversion optimization' and when is it enabled?

I don't know what that is either. Tom?

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: scott.marlowe (#15)
Re: like performance w/o wildcards.

"scott.marlowe" <scott.marlowe@ihs.com> writes:

It does ... if the index-conversion optimization is enabled at all.

Sorry, what is 'index-conversion optimization' and when is it enabled?

I don't know what that is either. Tom?

Sorry, I just meant the code that tries to extract indexable conditions
from a LIKE clause. With a fixed-prefix pattern, eg "x LIKE 'foo%'",
you get a range condition like "x >= 'foo' AND x < 'fop'". For a
completely fixed pattern this degenerates to "x = 'foo'". The whole
thing turns off in non-C locales though.

regards, tom lane