Case insensitive selects?

Started by David Reidabout 25 years ago26 messagesgeneral
Jump to latest
#1David Reid
dreid@jetnet.co.uk

Does pgsql support this and how would I do it?

david

#2Adam Lang
aalang@rutgersinsurance.com
In reply to: David Reid (#1)
Re: Case insensitive selects?

It is in the list archives several times.

All you need is to use some basic SQL.

select * from mytable where upper('my criteria') = upper(mytable.info);

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com
----- Original Message -----
From: "David Reid" <dreid@jetnet.co.uk>
To: <pgsql-general@postgresql.org>
Sent: Wednesday, February 14, 2001 9:58 AM
Subject: [GENERAL] Case insensitive selects?

Show quoted text

Does pgsql support this and how would I do it?

david

#3Anand Raman
araman@india-today.com
In reply to: Adam Lang (#2)
Re: Case insensitive selects?

Hi
use it if u absolutly need it.. Using a function on a column name
doesnt use the index associated with that column,.. So exercise this
option with some amount of thinking..

Anand

Show quoted text

On Wed, Feb 14, 2001 at 11:39:47AM -0500, Adam Lang wrote:

It is in the list archives several times.

All you need is to use some basic SQL.

select * from mytable where upper('my criteria') = upper(mytable.info);

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com
----- Original Message -----
From: "David Reid" <dreid@jetnet.co.uk>
To: <pgsql-general@postgresql.org>
Sent: Wednesday, February 14, 2001 9:58 AM
Subject: [GENERAL] Case insensitive selects?

Does pgsql support this and how would I do it?

david

#4Michael Fork
mfork@toledolink.com
In reply to: Anand Raman (#3)
Re: Case insensitive selects?

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio

On Thu, 15 Feb 2001, Anand Raman wrote:

Show quoted text

Hi
use it if u absolutly need it.. Using a function on a column name
doesnt use the index associated with that column,.. So exercise this
option with some amount of thinking..

Anand
On Wed, Feb 14, 2001 at 11:39:47AM -0500, Adam Lang wrote:

It is in the list archives several times.

All you need is to use some basic SQL.

select * from mytable where upper('my criteria') = upper(mytable.info);

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com
----- Original Message -----
From: "David Reid" <dreid@jetnet.co.uk>
To: <pgsql-general@postgresql.org>
Sent: Wednesday, February 14, 2001 9:58 AM
Subject: [GENERAL] Case insensitive selects?

Does pgsql support this and how would I do it?

david

#5David Wheeler
david@creationengines.com
In reply to: Michael Fork (#4)
Re: Case insensitive selects?

On Thu, 15 Feb 2001, Michael Fork wrote:

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

I should also not that we're also using --with-multibyte and having all of
our databases use Unicode exclusively.

Thanks!

David

#6David E. Wheeler
david@kineticode.com
In reply to: David Wheeler (#5)
Re: Case insensitive selects?

On Thu, 15 Feb 2001, Michael Fork wrote:

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

I should also not that we're also using --with-multibyte and having all of
our databases use Unicode exclusively.

Thanks!

David

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#6)
Re: Case insensitive selects?

David Wheeler <david@wheeler.net> writes:

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

Then why are you bothering to maintain a case-sensitive index?

There's no free lunch available here; if you think there is, then you
are misunderstanding what an index is. Either the index is in
case-sensitive order, or it's not.

regards, tom lane

#8David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#7)
Re: Case insensitive selects?

On Thu, 15 Feb 2001, Tom Lane wrote:

Then why are you bothering to maintain a case-sensitive index?

Because while some queries do a case-insensitive query, others do not, in
the sense that I do not everywhere convert the string to compare to lower
case.

There's no free lunch available here; if you think there is, then you
are misunderstanding what an index is. Either the index is in
case-sensitive order, or it's not.

Well, I think I understand pretty well what an index is. But I don't get
that the earlier example was of a case-insensitive index, but of an index
where all the entries were forced into lower case (or upper case, as the
case may be [pun not intended]). Thus, if I have this index:

CREATE INDEX idx_mime_type__name ON mime_type(LOWER(name));

and I execute this query:

SELECT *
FROM mime_type
WHERE name = 'text/HTML';

Will it use the index I created above or not? I'm assuming not unless I
rewrite the query like this:

SELECT *
FROM mime_type
WHERE name = LOWER('text/HTML');

But then I wouldn't call the index I created "case-insensitive."

But I would be happy to know if I'm missing something here.

Thanks,

David

#9Michael Fork
mfork@toledolink.com
In reply to: David E. Wheeler (#6)
Re: Case insensitive selects?

If you are going to be only doing case-insensitive compares, why would you
have two indexes on the field?

Although I am no guru on PostgreSQL internals or database theory, a
case insensitive select on a mixed case index would not work for the
following reason (correct me if i am wrong):

1) becuase of ASCII values and the way btree indexes are ordered, 'A' and
'a' are not store next to each other, meaning that you cannot map all the
caracters of the index to the same case on the fly w/o missing a chunk of
index (unless you wanted to make multiple passes through the index, which
would negate any speed gains of *not* having multiple indexes becuase of
the exponential growth, i.e. searching for 'that' would require 16 passes
thru -- what, What, wHat, whAt, whaT, etc.)

Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio

On Thu, 15 Feb 2001, David Wheeler wrote:

Show quoted text

On Thu, 15 Feb 2001, Michael Fork wrote:

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

I should also not that we're also using --with-multibyte and having all of
our databases use Unicode exclusively.

Thanks!

David

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#8)
Re: Case insensitive selects?

David Wheeler <david@wheeler.net> writes:

Thus, if I have this index:

CREATE INDEX idx_mime_type__name ON mime_type(LOWER(name));

and I execute this query:

SELECT *
FROM mime_type
WHERE name = 'text/HTML';

Will it use the index I created above or not? I'm assuming not unless I
rewrite the query like this:

SELECT *
FROM mime_type
WHERE name = LOWER('text/HTML');

Not then either; you'd need to write

SELECT *
FROM mime_type
WHERE LOWER(name) = LOWER('text/HTML');

or equivalently

SELECT *
FROM mime_type
WHERE LOWER(name) = 'text/html';

which is what will result from constant-folding anyway.

The details of invocation seem beside the point, however. The point is
that a btree index is all about sort order, and the sort order of data
viewed case-sensitively is quite different from the sort order of
monocased data. Perhaps in an ASCII universe you could play some tricks
to make the same index serve both purposes, but it'll never work in
non-ASCII locales ...

regards, tom lane

#11James Thompson
jamest@math.ksu.edu
In reply to: Tom Lane (#10)
misc psql questions

About a lifetime ago I used to do quite a bit of work w/ Oracle.

It's command line sql tool had some pretty nice features that I haven't
been able to find in psql. I was wondering if any of the following
existed....

I think the first was called break on which altered in output from
something like

name date qty
-------------------------
Fred 01-JAN-2000 10
Fred 10-JAN-2000 13
Fred 01-JUL-2000 1
Fred 01-DEC-2000 100
Bob 01-JAN-2000 5
Bob 10-MAY-2000 10

to

name date qty
-------------------------
Fred 01-JAN-2000 10
10-JAN-2000 13
01-JUL-2000 1
01-DEC-2000 100
Bob 01-JAN-2000 5
10-MAY-2000 10

it also allowed for things like compute sum which would activate on breaks
but I don't recall how they worked.

The other thing I'd love to be able to do is get user input while running
a sql file. I don't recall how this worked exactly but the script would
either accept variables calling the script or prompt for them.

So if I had a sql script in a file named contact_report. And I did

prod=> \i contact_report 01-JAN-2000 31-DEC-2001

then it would load the script and replace IIRC &1 and &2 with the
respective dates listed on command line.

It also had an ACCEPT command that would cause it to prompt for input from
user and assign to a varable name. like

accept amount prompt 'Enter the amount to search for: '
select * from foo where quantity = &amount

These made it very easy to build simple reports and scripts for less
technical end users.

Is any of this possible with psql?

Thanks,
James

->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561
Kansas State University Department of Mathematics
->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<

#12Peter Eisentraut
peter_e@gmx.net
In reply to: James Thompson (#11)
Re: misc psql questions

James Thompson writes:

About a lifetime ago I used to do quite a bit of work w/ Oracle.

It's command line sql tool had some pretty nice features that I haven't
been able to find in psql. I was wondering if any of the following
existed....

I think the first was called break on which altered in output from
something like

name date qty
-------------------------
Fred 01-JAN-2000 10
Fred 10-JAN-2000 13
Fred 01-JUL-2000 1
Fred 01-DEC-2000 100
Bob 01-JAN-2000 5
Bob 10-MAY-2000 10

to

name date qty
-------------------------
Fred 01-JAN-2000 10
10-JAN-2000 13
01-JUL-2000 1
01-DEC-2000 100
Bob 01-JAN-2000 5
10-MAY-2000 10

it also allowed for things like compute sum which would activate on breaks
but I don't recall how they worked.

This seems to be a thing for a report generator. Try pgaccess.

The other thing I'd love to be able to do is get user input while running
a sql file. I don't recall how this worked exactly but the script would
either accept variables calling the script or prompt for them.

So if I had a sql script in a file named contact_report. And I did

prod=> \i contact_report 01-JAN-2000 31-DEC-2001

then it would load the script and replace IIRC &1 and &2 with the
respective dates listed on command line.

You can use \set to set variables.

It also had an ACCEPT command that would cause it to prompt for input from
user and assign to a varable name.

Try
\echo -n 'Prompt: '
\set varname `read input; echo $input`

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#13David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#10)
Re: Case insensitive selects?

On Thu, 15 Feb 2001, Tom Lane wrote:

Not then either; you'd need to write

SELECT *
FROM mime_type
WHERE LOWER(name) = LOWER('text/HTML');

or equivalently

SELECT *
FROM mime_type
WHERE LOWER(name) = 'text/html';

which is what will result from constant-folding anyway.

Yes, of course; my oversight.

The details of invocation seem beside the point, however. The point is
that a btree index is all about sort order, and the sort order of data
viewed case-sensitively is quite different from the sort order of
monocased data. Perhaps in an ASCII universe you could play some tricks
to make the same index serve both purposes, but it'll never work in
non-ASCII locales ...

Hmmm...somehow, MS gets it to work in SQL Server. Lord knows how (or if
it's effective or fast), but I won't worry about it (since the last
thing I want to do is switch to NT!). I'll just code more carefully per
the examples above to ensure proper index use.

Thanks,

David

#14David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#10)
Re: Case insensitive selects?

Hmmm... I'm trying to create an index,

CREATE INDEX idx_server__host_name ON server(LOWER(host_name));

But it won't create. Here's the error:

ERROR: DefineIndex: function 'upper(varchar)' does not exist

Anyone know what's up with that? The table does have the host_name column
of type VARCHAR.

Thanks,

David

#15David E. Wheeler
david@kineticode.com
In reply to: David E. Wheeler (#14)
Re: Case insensitive selects?

Forgot to mention, I'm using 7.03.

Thanks,

David

On Thu, 15 Feb 2001, David Wheeler wrote:

Show quoted text

Hmmm... I'm trying to create an index,

CREATE INDEX idx_server__host_name ON server(LOWER(host_name));

But it won't create. Here's the error:

ERROR: DefineIndex: function 'upper(varchar)' does not exist

Anyone know what's up with that? The table does have the host_name column
of type VARCHAR.

Thanks,

David

#16Mitch Vincent
mitch@venux.net
In reply to: David Wheeler (#5)
Re: Case insensitive selects?

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

If you want all of them to be case insensitive then make the upper ( or
lower() ) index and don't make any case sensitive queries! :-)

Make sure all your queries use upper() or lower() around the field and value
you're comparing and you're golden.. Unless I've misunderstood you, I don't
see the problem..

SELECT * FROM whatever WHERE lower(myfield) = lower('myvalue'); -- and make
your index on lower(myfield)... Viola!

-Mitch

#17Bruce Momjian
bruce@momjian.us
In reply to: David Wheeler (#5)
Re: Case insensitive selects?

On Thu, 15 Feb 2001, Michael Fork wrote:

Indexes *can* and *will* be used if you create the appropiate
functional indexes, i.e:

CREATE INDEX idx_table_field_upper ON table(upper(field));

SELECT field FROM table WHERE upper(field) LIKE upper('some string');

Hmmm...I'd hate to have two indexes on every field I query like this, one
case-senstive, one case-insensitve (like the one you create here). Is
there a configuration option or something that will tell pgsql to do
case-insensitive comparisons (kinda like MS SQL Server has)? That could
save us on indexing overhead, since we want all of our WHERE comparisons
to be case-insensitive, anyway.

I was wondering if we could do case-insensitive index waking by doing
looking for CAR as:

CAR
CAr
CaR
Car
cAR
cAr
caR
car

Basically you look for CAR, then back up in the btree, to CA and look
for r instead of R. I relized the number of tests would exponentially
explode, but isn't it just like btree walking where we back up to test
the lowercase of the letter.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#18David E. Wheeler
david@kineticode.com
In reply to: Bruce Momjian (#17)
Re: Case insensitive selects?

On Fri, 16 Feb 2001, Bruce Momjian wrote:

I was wondering if we could do case-insensitive index waking by doing
looking for CAR as:

CAR
CAr
CaR
Car
cAR
cAr
caR
car

Basically you look for CAR, then back up in the btree, to CA and look
for r instead of R. I relized the number of tests would exponentially
explode, but isn't it just like btree walking where we back up to test
the lowercase of the letter.

Wouldn't it be more efficient to just have a single, case-insensitive
index, and then have the query engine automagically compare to the index
in a case-insensitive way? I'm assuming that this is the sort of approach
MS takes, which is why one has to choose the sort order at installation
time. If I choose case-insensitive Unicode, then I would expect the server
to do these things for me behind the scenes:

* When I create an index, automatically convert all char/varchar/text
fields with lower().
* When I do a query, automatically use lower() on all fields and values
queried against.

The result would be the same as Mitch describes, only I don't have to do
the work in my queries. The database would assume I want case-insensitive
matching based on some configuration I set, and do all the lower()s for
me. Perhaps the configuration could be set on a per-database basis (like
character set now is with multibyte).

Does that make sense?

Best

David

#19Bruce Momjian
bruce@momjian.us
In reply to: David E. Wheeler (#18)
Re: Case insensitive selects?

Wouldn't it be more efficient to just have a single, case-insensitive
index, and then have the query engine automagically compare to the index
in a case-insensitive way? I'm assuming that this is the sort of approach
MS takes, which is why one has to choose the sort order at installation
time. If I choose case-insensitive Unicode, then I would expect the server
to do these things for me behind the scenes:

Yes, our CREATE INDEX lower(col) already does that, but you do have to
use lower(col) when doing the query.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#20David E. Wheeler
david@kineticode.com
In reply to: Bruce Momjian (#19)
Re: Case insensitive selects?

On Fri, 16 Feb 2001, Bruce Momjian wrote:

Yes, our CREATE INDEX lower(col) already does that, but you do have to
use lower(col) when doing the query.

Right, that's what I'm suggesting a configuration that automates the
lower(col) bit in CREATE INDEX and that automates the lower(col) in
queries.

BTW, I've run into some snags with CREATE INDEX lower(col). First it
wouldn't work because my col was varchar (fixec by creating a new
function) and then because I tried to combine columns:

CREATE UNIQUE INDEX idx_name ON server(lower(col1), col2);

But I see that either they all have to be inside the function or for there
be be no function. Will 7.1 support mixing like this?

Thanks for your prompt responses, Bruce.

David

#21Michael Fork
mfork@toledolink.com
In reply to: David E. Wheeler (#14)
#22Bruce Momjian
bruce@momjian.us
In reply to: David E. Wheeler (#20)
#23David E. Wheeler
david@kineticode.com
In reply to: Michael Fork (#21)
#24David E. Wheeler
david@kineticode.com
In reply to: Bruce Momjian (#22)
#25Martijn van Oosterhout
kleptog@svana.org
In reply to: David E. Wheeler (#6)
#26David E. Wheeler
david@kineticode.com
In reply to: Martijn van Oosterhout (#25)