Case Sensitivity

Started by Phillip Smithover 15 years ago7 messagesgeneral
Jump to latest
#1Phillip Smith
phillip@softworks.com

Hi,
I'm moving over from MS SQL. I've been googling this for ages now and suprisingly cannot find a clear answer.

I want my data in tables to be case insensitive.

This is so i can:
1. Put keys on natural key fields, for instance a product part number. I product number 'ABC123' inserted i need to disallow 'abc123' to be inserted as a second row. Please don't tell me i have to add another column holding a lowered version of the product number. My database is littered with this need and i would end up bloating my table schema.

2. I need to query case insensitively. SELECT * FROM product WHERE product_number = 'ABC123' should return the same row as SELECT * FROM product WHERE product_number = 'abc123'

Is there a database wide collation setting i can make, or a case insensitive character type. There are lots of online posts regarding using LOWER function for querying. This is a workaround for point (2) but does not remedy point (1) above.

Many thanks

Phillip

Phillip Smith

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Phillip Smith (#1)
Re: Case Sensitivity

Hello

you should to use a citext datatype

http://www.postgresql.org/docs/9.0/interactive/citext.html

regards

Pavel Stehule

2011/1/13 Phillip Smith <phillip@softworks.com>:

Show quoted text

Hi,
I'm moving over from MS SQL. I've been googling this for ages now and
suprisingly cannot find a clear answer.

I want my data in tables to be case insensitive.

This is so i can:
1. Put keys on natural key fields, for instance a product part number. I
product number 'ABC123' inserted i need to disallow 'abc123' to be inserted
as a second row. Please don't tell me i have to add another column holding a
lowered version of the product number. My database is littered with this
need and i would end up bloating my table schema.

2.  I need to query case insensitively. SELECT * FROM product WHERE
product_number = 'ABC123' should return the same row as SELECT * FROM
product WHERE product_number = 'abc123'

Is there a database wide collation setting i can make, or a case
insensitive character type. There are lots of online posts regarding using
LOWER function for querying. This is a workaround for point (2) but does not
remedy point (1) above.

Many thanks

Phillip

Phillip Smith

#3Phillip Smith
phillip@softworks.com
In reply to: Phillip Smith (#1)
Re: Case Sensitivity

Thanks Pavel,

Is this the only way?. I would prefer to use a collation setting if this is possible. Do you know whether there would be a performance hit using non standard character strings?

Phillip

#4Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Pavel Stehule (#2)
Re: Case Sensitivity

you should to use a citext datatype

http://www.postgresql.org/docs/9.0/interactive/citext.html

Or:

- use a functional index with lower() to ensure uniqueness
- use a BEFORE trigger to lower() the inserted data

Karsten

--
GMX DSL Doppel-Flat ab 19,99 Euro/mtl.! Jetzt mit
gratis Handy-Flat! http://portal.gmx.net/de/go/dsl

#5Phillip Smith
phillip@softworks.com
In reply to: Phillip Smith (#1)
Re: Case Sensitivity

Hi Karsten,

modifying user inputted data is not an option for me. I need to maintain the original data. I had read about indexing on a lower function. This all seems a bit of a bodgy workaround to me. A clean Case Insensitive collation setting is cleaner. Is this supported in postgreSQL 9?

Regards

Phillip

Phillip

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Phillip Smith (#3)
Re: Case Sensitivity

2011/1/13 Phillip Smith <phillip@softworks.com>:

Thanks Pavel,

Is this the only way?. I would prefer to use a collation setting if this is
possible. Do you know whether there would be a performance hit using non
standard character strings?

It is preferred way. PostgreSQL doesn't support a collations, and what
I know collation in 9.1 will not allow a comparation between case
sensitive and case insensitive text, because collation in PostgreSQL
is based on unix locale, and there are not case insensitive locale.

I don't know about performance problems. It does same work like case
insensitive collation.

Regards

Pavel Stehule

Show quoted text

Phillip

#7Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Phillip Smith (#5)
Re: Case Sensitivity

On Thu, Jan 13, 2011 at 03:29:03PM -0000, Phillip Smith wrote:

modifying user inputted data is not an option for me. I need to
maintain the original data. I had read about indexing on a lower
function. This all seems a bit of a bodgy workaround to me. A clean
Case Insensitive collation setting is cleaner. Is this supported in
postgreSQL 9?

There isn't such a collation setting as far as I know, and it's
actually not as clean as you think. It turns out to work reasonably
well in ASCII-land, but not very well in other circumstances. In
general, the case-insensitive but case-preserving technique that
English-speaking computer users have come to find normal is a bad fit
for a wide variety of languages. (Even "Latin" characterset-using
languages have trouble, because of historic ways of handling accents.
Is the capital version of é E or É? Well, both, it turns out,
depending on whom you believe. I could bore you about the effects of
this in the DNS all day long, but I'll resist the temptation.)

Putting an index on lower(column) and then doing all your comparisons
with lower(datum) works fairly well, and I don't see how it's any more
bodgy than a database-wide case insensitive collation. For instance,
I can assure you that customers named Leblanc and LeBlanc care about
whether those two compare equally. In your customer name field, if
you have a database-wide collation setting, you can't make the
distinction.

A

--
Andrew Sullivan
ajs@crankycanuck.ca