SQL Standards Compliance With Case

Started by Rich Shepardalmost 20 years ago8 messagesgeneral
Jump to latest
#1Rich Shepard
rshepard@appl-ecosys.com

I'm trying to assist the XRMS developers port their application to
postgres (8.1.x on), and it's almost there. One (perhaps the only) stumbling
block is case for table and column (relation and attribute) names.
Apparently MySQL allows for mixed case, while postgres wants only lower
case. One of the development team asked me to enquire when postgres would be
fully compliant with the SQL standard in this reqard. So I'm asking. Not
challenging, not complaining, but asking to learn something about case and
the SQL standard as implemented in postgres.

While I would prefer to not read the latest SQL standard specification,
I'd like to help resolve the last six errors when I try to install XRMS on
my postgres-8.1.4 system.

Here's what the install.php script returns:

Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "group_id" already exists
I tried to execute:
CREATE INDEX Group_id ON GroupUser (Group_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "role_id" already exists
I tried to execute:
CREATE INDEX Role_id ON RolePermission (Role_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "company_id" already exists
I tried to execute:
CREATE INDEX company_id ON company_former_names (company_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "contact_id" already exists
I tried to execute:
CREATE INDEX contact_id ON contact_former_companies (contact_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "company_id" already exists
I tried to execute:
CREATE INDEX company_id ON contacts (company_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "province" already exists
I tried to execute:
CREATE INDEX province ON time_zones (province)

The first two appear to me to be case related, but I don't understand the
last four at all.

I'll be quite appreciative of learning what the SQL standard has to say
about case, where postgres differs, and what I should look for in the php
scripts' SQL statements to resolve these errors.

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com&gt; Voice: 503-667-4517 Fax: 503-667-8863

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Rich Shepard (#1)
Re: SQL Standards Compliance With Case

On Wednesday 12 July 2006 17:33, Rich Shepard wrote:

I'm trying to assist the XRMS developers port their application to
postgres (8.1.x on), and it's almost there. One (perhaps the only)
stumbling block is case for table and column (relation and attribute)
names. Apparently MySQL allows for mixed case, while postgres wants only
lower case. One of the development team asked me to enquire when postgres
would be fully compliant with the SQL standard in this reqard. So I'm
asking. Not challenging, not complaining, but asking to learn something
about case and the SQL standard as implemented in postgres.

Likley, not ever ;)...

While I would prefer to not read the latest SQL standard specification,
I'd like to help resolve the last six errors when I try to install XRMS on
my postgres-8.1.4 system.

When I look below though...

Here's what the install.php script returns:

Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "group_id" already exists
I tried to execute:
CREATE INDEX Group_id ON GroupUser (Group_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.

Something is wrong, that should work.. postgres will just fold the case..

postgres=# create table Foo (id bigserial primary key, Group_id integer);
NOTICE: CREATE TABLE will create implicit sequence "foo_id_seq" for serial
column "foo.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for
table "foo"
CREATE TABLE
postgres=# CREATE INDEX Group_id ON Foo (Group_id);
CREATE INDEX

See it works :)

Something is wrong somewhere else... for example the below:

ERROR: relation "role_id" already exists

AHA! I see what is happening... you can't have an index name the same as a
table name. Tell them just to change the index names... for example:

CREATE INDEX Group_id_idx ON Foo (Group_id);

Sincerely,

Joshua D. Drake

I tried to execute:
CREATE INDEX Role_id ON RolePermission (Role_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "company_id" already exists
I tried to execute:
CREATE INDEX company_id ON company_former_names (company_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "contact_id" already exists
I tried to execute:
CREATE INDEX contact_id ON contact_former_companies (contact_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "company_id" already exists
I tried to execute:
CREATE INDEX company_id ON contacts (company_id)
Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "province" already exists
I tried to execute:
CREATE INDEX province ON time_zones (province)

The first two appear to me to be case related, but I don't understand
the last four at all.

I'll be quite appreciative of learning what the SQL standard has to say
about case, where postgres differs, and what I should look for in the php
scripts' SQL statements to resolve these errors.

Rich

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#3Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Rich Shepard (#1)
Re: SQL Standards Compliance With Case

On Wed, 12 Jul 2006, Rich Shepard wrote:

I'm trying to assist the XRMS developers port their application to
postgres (8.1.x on), and it's almost there. One (perhaps the only) stumbling
block is case for table and column (relation and attribute) names.
Apparently MySQL allows for mixed case, while postgres wants only lower
case. One of the development team asked me to enquire when postgres would be
fully compliant with the SQL standard in this reqard. So I'm asking. Not
challenging, not complaining, but asking to learn something about case and
the SQL standard as implemented in postgres.

AFAIK, SQL says that an non-quoted identifier such as Foo is treated as
FOO (case-folded to upper). PostgreSQL currently treats it as foo
(case-folded to lower). Quoted identifiers are not case-folded and are
compared case-sensitive.

So, for example my understanding of spec would say:

create table Foo(a numeric(10,3));
create table FoO(b numeric(11,4));
-- invalid because this is the same table name as the first create
create table "foo"(a numeric(12,5));
-- valid in SQL, invalid in PostgreSQL
create table "Foo"(a numeric(13,6));
-- valid, that's actually the mixedcase table Foo rather than FOO

or:
create table "Foo"(a numeric(10,3));
select * from Foo -- invalid, that's FOO not Foo
select * from "Foo" -- valid

create table Foo(a numeric(10,3)); -- folded to FOO
select * from foo -- valid
select * from "foo" -- invalid

While I would prefer to not read the latest SQL standard specification,
I'd like to help resolve the last six errors when I try to install XRMS on
my postgres-8.1.4 system.

Here's what the install.php script returns:

Unable to execute your query. Please correct this error.
You may need to update your database structure.
ERROR: relation "group_id" already exists
I tried to execute:
CREATE INDEX Group_id ON GroupUser (Group_id)

These seem to be complaining that there's already a table, view, index,
etc with that name already. Is there one being created with a different
case that's assuming that it'll preserve case rather than fold?

#4Rich Shepard
rshepard@appl-ecosys.com
In reply to: Joshua D. Drake (#2)
Re: SQL Standards Compliance With Case

On Wed, 12 Jul 2006, Joshua D. Drake wrote:

Likley, not ever ;)...

Hi, Josh!

That's fine with me. As long as postgres works, I'm happy.

ERROR: relation "role_id" already exists

AHA! I see what is happening... you can't have an index name the same as a
table name. Tell them just to change the index names... for example:

CREATE INDEX Group_id_idx ON Foo (Group_id);

Aha! I didn't see that. Even I know to add _idx to the table name ... when
I do it myself.

Thank you very much,

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com&gt; Voice: 503-667-4517 Fax: 503-667-8863

#5Rich Shepard
rshepard@appl-ecosys.com
In reply to: Stephan Szabo (#3)
Re: SQL Standards Compliance With Casexx

On Wed, 12 Jul 2006, Stephan Szabo wrote:

AFAIK, SQL says that an non-quoted identifier such as Foo is treated as
FOO (case-folded to upper). PostgreSQL currently treats it as foo
(case-folded to lower). Quoted identifiers are not case-folded and are
compared case-sensitive.

So, for example my understanding of spec would say:

Thank you very much for the complete explanation, Stephan.

These seem to be complaining that there's already a table, view, index,
etc with that name already. Is there one being created with a different
case that's assuming that it'll preserve case rather than fold?

That's what Josh pointed out. I totally missed that.

Many thanks,

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com&gt; Voice: 503-667-4517 Fax: 503-667-8863

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rich Shepard (#1)
Re: SQL Standards Compliance With Case

Rich Shepard <rshepard@appl-ecosys.com> writes:

I'm trying to assist the XRMS developers port their application to
postgres (8.1.x on), and it's almost there. One (perhaps the only) stumbling
block is case for table and column (relation and attribute) names.
Apparently MySQL allows for mixed case, while postgres wants only lower
case. One of the development team asked me to enquire when postgres would be
fully compliant with the SQL standard in this reqard.

It's probably worth pointing out here that the MySQL behavior they seem
to be expecting is considerably further from the spec than Postgres's
behavior. If I'm reading between the lines correctly, they are
expecting foo and Foo (both written without double-quotes) to be
distinct identifiers. But these are the same identifier per spec,
because the spec *requires* case-folding of unquoted identifiers.

The difference between what PG does and what the spec says can only be
observed when you sometimes double-quote a particular identifier and
sometimes don't. PG makes FOO, Foo, foo and "foo" the same, but
different from "FOO" or "Foo"; while the spec would have FOO, Foo, foo
and "FOO" the same but different from "foo" or "Foo".

The relevant bits from SQL92 section 5.2 are

10)The <identifier body> of a <regular identifier> is equivalent
to an <identifier body> in which every letter that is a lower-
case letter is replaced by the equivalent upper-case letter
or letters. This treatment includes determination of equiva-
lence, representation in the Information and Definition Schemas,
representation in the diagnostics area, and similar uses.

12)Two <regular identifier>s are equivalent if their <identifier
body>s, considered as the repetition of a <character string
literal> that specifies a <character set specification> of
SQL_TEXT, compare equally according to the comparison rules
in Subclause 8.2, "<comparison predicate>".

13)A <regular identifier> and a <delimited identifier> are equiva-
lent if the <identifier body> of the <regular identifier> (with
every letter that is a lower-case letter replaced by the equiva-
lent upper-case letter or letters) and the <delimited identifier
body> of the <delimited identifier> (with all occurrences of
<quote> replaced by <quote symbol> and all occurrences of <dou-
blequote symbol> replaced by <double quote>), considered as
the repetition of a <character string literal> that specifies a
<character set specification> of SQL_TEXT and an implementation-
defined collation that is sensitive to case, compare equally
according to the comparison rules in Subclause 8.2, "<comparison
predicate>".

14)Two <delimited identifier>s are equivalent if their <delimited
identifier body>s (with all occurrences of <quote> replaced
by <quote symbol> and all occurrences of <doublequote symbol>
replaced by <doublequote>), considered as the repetition of a
<character string literal> that specifies a <character set spec-
ification> of SQL_TEXT and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, "<comparison predicate>".

regards, tom lane

#7Rich Shepard
rshepard@appl-ecosys.com
In reply to: Tom Lane (#6)
Re: SQL Standards Compliance With Case

On Wed, 12 Jul 2006, Tom Lane wrote:

It's probably worth pointing out here that the MySQL behavior they seem to
be expecting is considerably further from the spec than Postgres's
behavior. If I'm reading between the lines correctly, they are expecting
foo and Foo (both written without double-quotes) to be distinct
identifiers. But these are the same identifier per spec, because the spec
*requires* case-folding of unquoted identifiers.

Thanks very much, Tom.

It turns out that the problem was index names. I tried to solve that, but
being unfamiliar with the entire application structure, I ended up defining
the same index name more than once. Sigh.

I do appreciate the insight about case. I've always used only lower case
for table and field names, even back in the DOS days when I was writing
database applications in C.

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com&gt; Voice: 503-667-4517 Fax: 503-667-8863

#8Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Tom Lane (#6)
Re: SQL Standards Compliance With Case

On Wed, 2006-07-12 at 22:26, Tom Lane wrote:

Rich Shepard <rshepard@appl-ecosys.com> writes:

I'm trying to assist the XRMS developers port their application to
postgres (8.1.x on), and it's almost there. One (perhaps the only) stumbling
block is case for table and column (relation and attribute) names.
Apparently MySQL allows for mixed case, while postgres wants only lower
case. One of the development team asked me to enquire when postgres would be
fully compliant with the SQL standard in this reqard.

It's probably worth pointing out here that the MySQL behavior they seem
to be expecting is considerably further from the spec than Postgres's
behavior. If I'm reading between the lines correctly, they are
expecting foo and Foo (both written without double-quotes) to be
distinct identifiers. But these are the same identifier per spec,
because the spec *requires* case-folding of unquoted identifiers.

It's even worse than that. MySQL uses filenames to identify tables,
like PostgreSQL did back in the days when dinosaurs roamed the plains
and the British Police force drove sexy cars (ok, not that far back, but
anyway)...

This means that when installed in unix, table Foo and table foo are
unique and pretty flowers, but when installed on Windows, they are the
same name...

My recommendation is to either stick to one case, all the time, or to
quote, all the time. I prefer to just stick to one case all the time.