Allow disabling folding of unquoted identifiers to lowercase

Started by Evgeny Morozovalmost 10 years ago14 messagesgeneral
Jump to latest
#1Evgeny Morozov
evgeny.morozov+list+pgsql@shift-technology.com

It would be great if Postgres had a server setting that allowed the
automatic folding of identifiers to lowercase to be disabled, so that camel
case identifiers could be used without having to quote every single
identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

I understand that Postgres has a well-established naming convention and I'm
certainly not suggesting changing the default behaviour, but having this as
an option would remove a major pain point for users migrating from another
RDBMS, like MS SQL Server.

My company is looking into doing this. Currently our table and column names
exactly match our class and property names, which are in camel case. MSSQL
supports this just fine. To move to Postgres we would have to either quote
*everything* or translate names back-and-forth between code and database.
Both options are OK for auto-generated SQL, but we also have many users
writing ad-hoc SQL queries. Having to quote everything would have those
users screaming to switch back to MSSQL very quickly! That leaves us with
the mapping approach, which is doable, but also a constant "mental
speedbump" at best.

#2John R Pierce
pierce@hogranch.com
In reply to: Evgeny Morozov (#1)
Re: Allow disabling folding of unquoted identifiers to lowercase

On 4/29/2016 10:21 AM, Evgeny Morozov wrote:

It would be great if Postgres had a server setting that allowed the
automatic folding of identifiers to lowercase to be disabled, so that
camel case identifiers could be used without having to quote every
single identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

I suspect this would be painful for the parser, unless you also enforced
that all SQL keywords were in a specific case (all lower would be the
minimal impact to the code). otherwise the parser would have to
lower() every token to check to see if its a keyword, but if not, revert
it to its original case.

--
john r pierce, recycling bits in santa cruz

#3John McKown
john.archie.mckown@gmail.com
In reply to: John R Pierce (#2)
Re: Allow disabling folding of unquoted identifiers to lowercase

On Fri, Apr 29, 2016 at 2:44 PM, John R Pierce <pierce@hogranch.com> wrote:

On 4/29/2016 10:21 AM, Evgeny Morozov wrote:

It would be great if Postgres had a server setting that allowed the
automatic folding of identifiers to lowercase to be disabled, so that camel
case identifiers could be used without having to quote every single
identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

I suspect this would be painful for the parser, unless you also enforced
that all SQL keywords were in a specific case (all lower would be the
minimal impact to the code). otherwise the parser would have to lower()
every token to check to see if its a keyword, but if not, revert it to its
original case.

​Why? PostgreSQL is written in C. So use strncasecmp() instead of strncmp()
or strcasecmp() instead of strcmp() to test for a token.

--
john r pierce, recycling bits in santa cruz

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown

#4John R Pierce
pierce@hogranch.com
In reply to: John McKown (#3)
Re: Allow disabling folding of unquoted identifiers to lowercase

On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also
enforced that all SQL keywords were in a specific case (all lower
would be the minimal impact to the code). otherwise the parser
would have to lower() every token to check to see if its a
keyword, but if not, revert it to its original case.

​ Why? PostgreSQL is written in C. So use strncasecmp() instead of
strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?

--
john r pierce, recycling bits in santa cruz

#5John McKown
john.archie.mckown@gmail.com
In reply to: John R Pierce (#4)
Re: Allow disabling folding of unquoted identifiers to lowercase

On Fri, Apr 29, 2016 at 3:38 PM, John R Pierce <pierce@hogranch.com> wrote:

On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also enforced

that all SQL keywords were in a specific case (all lower would be the
minimal impact to the code). otherwise the parser would have to lower()
every token to check to see if its a keyword, but if not, revert it to its
original case.

​ Why? PostgreSQL is written in C. So use strncasecmp() instead of
strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?

​Ah. Good point. I ASSuMEd that what any C language program would use.
IIRC, PostgreSQL uses Bison and Flex to generate the parsing grammars, and
I don't know what those products output.​

--
john r pierce, recycling bits in santa cruz

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown

#6John McKown
john.archie.mckown@gmail.com
In reply to: John R Pierce (#4)
Re: Allow disabling folding of unquoted identifiers to lowercase

On Fri, Apr 29, 2016 at 3:38 PM, John R Pierce <pierce@hogranch.com> wrote:

On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also enforced

that all SQL keywords were in a specific case (all lower would be the
minimal impact to the code). otherwise the parser would have to lower()
every token to check to see if its a keyword, but if not, revert it to its
original case.

​ Why? PostgreSQL is written in C. So use strncasecmp() instead of
strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?

​Did a quick check of the .c files in the backend/parser directory source
code. All the programs use the, oldest, strcmp() function. Too bad.​ I'm
not a C expert, nor extremely familiar with the PostgreSQL source code. But
doing some scans, I see the use of a function called "ScanKeywordLookup"
and it does a case insensitive search "the hard way". The comments indicate
that this is due the the SQL standard requiring proper use of Unicode
translation:
comment:
/*
* Apply an ASCII-only downcasing. We must not use tolower() since
it may
* produce the wrong translation in some locales (eg, Turkish).
*/

​Oh well, it's been interesting, but I don't think that we'll come to a
resolution for the OP on this issue. I just blame both PostgreSQL and MySQL
for this problem because the SQL standard says that the names are
automatically UPPERCASEd unless enclosed in quotes. Not lower cased as
PostgreSQL does it, nor unchanged as MySQL does it.​

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: John McKown (#6)
Re: Allow disabling folding of unquoted identifiers to lowercase

John McKown <john.archie.mckown@gmail.com> writes:

​Oh well, it's been interesting, but I don't think that we'll come to a
resolution for the OP on this issue. I just blame both PostgreSQL and MySQL
for this problem because the SQL standard says that the names are
automatically UPPERCASEd unless enclosed in quotes. Not lower cased as
PostgreSQL does it, nor unchanged as MySQL does it.​

Yeah, this isn't going to happen. Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower). It was impractical then and no doubt is far
more so now. I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it. So I doubt we'd accept such a patch
even if someone managed to create one.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Evgeny Morozov
evgeny.morozov+list+pgsql@shift-technology.com
In reply to: Tom Lane (#7)
Re: Allow disabling folding of unquoted identifiers to lowercase

On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, this isn't going to happen. Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower). It was impractical then and no doubt is far
more so now. I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code, I
naively assumed it should be easy to add an option to not do the case
folding.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it.

Indeed, my argument was it would allow people to choose their own naming

convention to (easily) use with Postgres, which would in turn make it
easier to migrate from other RDBMSes to Postgres. Although, if you want a
standard compliance argument, that can easily be added. :) Just have 3
options: fold to lowercase (current behaviour, default), fold to uppercase
(standards compliant), do not fold (most flexible, compatible with MSSQL
and MySQL).

So I doubt we'd accept such a patch even if someone managed to create one.

Well, I was going to ask if paying someone to fix this was an option, but
this preempts that!

#9John McKown
john.archie.mckown@gmail.com
In reply to: Evgeny Morozov (#8)
Re: Allow disabling folding of unquoted identifiers to lowercase

On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <
evgeny.morozov+list+pgsql@shift-technology.com> wrote:

On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, this isn't going to happen. Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower). It was impractical then and no doubt is far
more so now. I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code, I
naively assumed it should be easy to add an option to not do the case
folding.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it.

Indeed, my argument was it would allow people to choose their own naming

convention to (easily) use with Postgres, which would in turn make it
easier to migrate from other RDBMSes to Postgres. Although, if you want a
standard compliance argument, that can easily be added. :) Just have 3
options: fold to lowercase (current behaviour, default), fold to uppercase
(standards compliant), do not fold (most flexible, compatible with MSSQL
and MySQL).

So I doubt we'd accept such a patch even if someone managed to create

one.

Well, I was going to ask if paying someone to fix this was an option, but
this preempts that!

​I have a silly idea. IIRC, your original problem is that your users are in
the habit of entering something like:

select ... from SomeTable ...

And MySQL would actually use the name "SomeTable" (case preserving) and not
"sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program are
the users actually using to do the select? If it is something like "psql",
perhaps it would actually be easier to create a modified version which
automatically inserts the " marks for them. Of course, you are now doing
double parsing of the SQL. First in the client, to modify it before sending
to the server. Then again in the server. OK, maybe it is going too far. I
guess this might be a "quote everything which is not a keyword" option for
a psql replacement. Or whatever the front end is that the users use.

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown

#10Evgeny Morozov
evgeny.morozov@shift-technology.com
In reply to: John McKown (#9)
Re: Allow disabling folding of unquoted identifiers to lowercase

That's an interesting idea! The client users would use is probably pgAdmin.
I don't know whether pgAdmin parses the query, though. If it does then it
should be relatively easy to add this. If not, I'd imagine it's not going
to happen.

On 2 May 2016 at 13:59, John McKown <john.archie.mckown@gmail.com> wrote:

Show quoted text

On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <
evgeny.morozov+list+pgsql@shift-technology.com> wrote:

On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, this isn't going to happen. Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower). It was impractical then and no doubt is far
more so now. I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code,
I naively assumed it should be easy to add an option to not do the case
folding.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it.

Indeed, my argument was it would allow people to choose their own naming

convention to (easily) use with Postgres, which would in turn make it
easier to migrate from other RDBMSes to Postgres. Although, if you want a
standard compliance argument, that can easily be added. :) Just have 3
options: fold to lowercase (current behaviour, default), fold to uppercase
(standards compliant), do not fold (most flexible, compatible with MSSQL
and MySQL).

So I doubt we'd accept such a patch even if someone managed to create

one.

Well, I was going to ask if paying someone to fix this was an option, but
this preempts that!

​I have a silly idea. IIRC, your original problem is that your users are
in the habit of entering something like:

select ... from SomeTable ...

And MySQL would actually use the name "SomeTable" (case preserving) and
not "sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program
are the users actually using to do the select? If it is something like
"psql", perhaps it would actually be easier to create a modified version
which automatically inserts the " marks for them. Of course, you are now
doing double parsing of the SQL. First in the client, to modify it before
sending to the server. Then again in the server. OK, maybe it is going too
far. I guess this might be a "quote everything which is not a keyword"
option for a psql replacement. Or whatever the front end is that the users
use.

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown

#11Guillaume Lelarge
guillaume@lelarge.info
In reply to: Evgeny Morozov (#10)
Re: Allow disabling folding of unquoted identifiers to lowercase

Le 3 mai 2016 7:01 PM, "Evgeny Morozov" <evgeny.morozov@shift-technology.com>
a écrit :

That's an interesting idea! The client users would use is probably

pgAdmin. I don't know whether pgAdmin parses the query, though. If it does
then it should be relatively easy to add this. If not, I'd imagine it's not
going to happen.

The pgAdmin query tool doesn't parse the query. It sends to the server
without parsing itself.

On 2 May 2016 at 13:59, John McKown <john.archie.mckown@gmail.com> wrote:

On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <

evgeny.morozov+list+pgsql@shift-technology.com> wrote:

On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, this isn't going to happen. Years ago we looked into what it

would

take to be able to flip a switch and have the standard-compliant

behavior

(fold to upper not lower). It was impractical then and no doubt is far
more so now. I do not remember all the details, but there were

multiple

pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the

code, I naively assumed it should be easy to add an option to not do the
case folding.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it.

Indeed, my argument was it would allow people to choose their own

naming convention to (easily) use with Postgres, which would in turn make
it easier to migrate from other RDBMSes to Postgres. Although, if you want
a standard compliance argument, that can easily be added. :) Just have 3
options: fold to lowercase (current behaviour, default), fold to uppercase
(standards compliant), do not fold (most flexible, compatible with MSSQL
and MySQL).

So I doubt we'd accept such a patch even if someone managed to create

one.

Well, I was going to ask if paying someone to fix this was an option,

but this preempts that!

​I have a silly idea. IIRC, your original problem is that your users are

in the habit of entering something like:

select ... from SomeTable ...

And MySQL would actually use the name "SomeTable" (case preserving) and

not "sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program
are the users actually using to do the select? If it is something like
"psql", perhaps it would actually be easier to create a modified version
which automatically inserts the " marks for them. Of course, you are now
doing double parsing of the SQL. First in the client, to modify it before
sending to the server. Then again in the server. OK, maybe it is going too
far. I guess this might be a "quote everything which is not a keyword"
option for a psql replacement. Or whatever the front end is that the users
use.

--
The unfacts, did we have them, are too imprecisely few to warrant our

certitude.

Show quoted text

Maranatha! <><
John McKown

#12Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Evgeny Morozov (#1)
Re: Allow disabling folding of unquoted identifiers to lowercase

On 2016-04-29 19:21:30 +0200, Evgeny Morozov wrote:

It would be great if Postgres had a server setting that allowed the automatic
folding of identifiers to lowercase to be disabled, so that camel case
identifiers could be used without having to quote every single identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

[...]

My company is looking into doing this. Currently our table and column names
exactly match our class and property names, which are in camel case. MSSQL
supports this just fine. To move to Postgres we would have to either quote
*everything* or translate names back-and-forth between code and database. Both
options are OK for auto-generated SQL, but we also have many users writing
ad-hoc SQL queries. Having to quote everything would have those users screaming
to switch back to MSSQL very quickly! That leaves us with the mapping approach,
which is doable, but also a constant "mental speedbump" at best.

What exactly is the problem you are trying to solve?

If you and your users are consistent about never using quotes, your
users can write:

SELECT MyColumn FROM MyTable ORDER BY MyColumn;

It will select mycolumn from mytable, but that doesn't matter, since you
created the table with

CREATE MyTable (MyColumn varchar);

so you really have a table mytable with a column mycolumn, not a table
MyTable with a column MyColumn.

There are three potential problems I can see:

1) Users might be confused that PgAdmin (or whatever tool they use to
inspect the database) displays all the names in lowercase, and they
might find a name like sometablewithareallylongname less readable
than SomeTableWithAReallyLongName.

2) Since case doesn't matter, they might be inconsistent: One programmer
might write MyTable, another mytable, the third MYTABLE, ...

3) You might want to use a tool to automatically generate SQL queries,
but that tool quotes identifiers.

hp

--
_ | Peter J. Holzer | I want to forget all about both belts and
|_|_) | | suspenders; instead, I want to buy pants
| | | hjp@hjp.at | that actually fit.
__/ | http://www.hjp.at/ | -- http://noncombatant.org/

#13Klaus P. Pieper
kpi6288@gmail.com
In reply to: Peter J. Holzer (#12)
Re: Allow disabling folding of unquoted identifiers to lowercase

-----Ursprüngliche Nachricht-----

What exactly is the problem you are trying to solve?

If you and your users are consistent about never using quotes, your users

can

write:

SELECT MyColumn FROM MyTable ORDER BY MyColumn;

It will select mycolumn from mytable, but that doesn't matter, since you
created the table with

CREATE MyTable (MyColumn varchar);

so you really have a table mytable with a column mycolumn, not a table
MyTable with a column MyColumn.

I use 50% of my time Sybase and 50% PostgreSQL.

For me is the way Sybase works is just more convenient:

CREATE MyTable (MyColumn varchar);

creates a camel cased table MyType and field MyColumn.

SELECT * FROM SYSCATALOG gives MyTable.

This is better readable when you use long table / fiel names.

The user can then use whatever he wants: mytable, MyTable, myTABLE, ...

Klaus

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

In reply to: Klaus P. Pieper (#13)
Re: Allow disabling folding of unquoted identifiers to lowercase

On 08/05/2016 10:41, Klaus P. Pieper wrote:

For me is the way Sybase works is just more convenient:

CREATE MyTable (MyColumn varchar);

creates a camel cased table MyType and field MyColumn.

SELECT * FROM SYSCATALOG gives MyTable.

This is better readable when you use long table / fiel names.

I reckon this is just a matter of style... I use underscores to make
long names readable...

create table my_table_with_a_really_long_name;

....and it works fine.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general