Set COLLATE on a session level

Started by Dirk Mikaover 5 years ago18 messagesgeneral
Jump to latest
#1Dirk Mika
Dirk.Mika@mikatiming.de

Hello,

we come from the Oracle world and we have an application that, depending on a setting, sends the command ALTER SESSION SET NLS_SORT=... when connecting to the database.

Is there a similar way to set a COLLATE for a session in PostgreSQL?

I know that I can specify a COLLATE for a SELECT statement in the ORDER BY Clause, but then I would have to adjust the statements in the client and statements that are automatically generated by the database components used, would not be affected.

Regards
Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

## INEOS 1:59 memories
## https://www.facebook.com/mikatiming

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Dirk Mika (#1)
Re: Set COLLATE on a session level

On Fri, 2020-11-20 at 07:13 +0000, Dirk Mika wrote:

we come from the Oracle world and we have an application that, depending on a setting,
sends the command ALTER SESSION SET NLS_SORT=... when connecting to the database.

Is there a similar way to set a COLLATE for a session in PostgreSQL?

I know that I can specify a COLLATE for a SELECT statement in the ORDER BY Clause,
but then I would have to adjust the statements in the client and statements that are
automatically generated by the database components used, would not be affected.

There is no way to do that in PostgreSQL.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#3Dirk Mika
Dirk.Mika@mikatiming.de
In reply to: Laurenz Albe (#2)
Re: Set COLLATE on a session level

Thank you for the quick reply. But how is this usually solved?

Let's assume there is an app that accesses the same database from different countries. And in this app data should be displayed ordered. And the sort order is not identical in all countries.

Does the app have to send different SQL commands depending on the country? Not nice.
Do the data have to be sorted in the app? Not nice either.

Regards
Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dirk Mika (#3)
Re: Set COLLATE on a session level

Hi

pá 20. 11. 2020 v 15:28 odesílatel Dirk Mika <Dirk.Mika@mikatiming.de>
napsal:

Thank you for the quick reply. But how is this usually solved?

Let's assume there is an app that accesses the same database from
different countries. And in this app data should be displayed ordered. And
the sort order is not identical in all countries.

Does the app have to send different SQL commands depending on the country?
Not nice.
Do the data have to be sorted in the app? Not nice either.

the query is the same - you just use a different COLLATE clause. For
Postgres there is not any other way.

Regards

Pavel

Show quoted text

Regards
Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

#5Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Pavel Stehule (#4)
Re: Set COLLATE on a session level

On Fri, Nov 20, 2020 at 03:32:48PM +0100, Pavel Stehule wrote:

pá 20. 11. 2020 v 15:28 odesílatel Dirk Mika <Dirk.Mika@mikatiming.de>
napsal:

Let's assume there is an app that accesses the same database from
different countries. And in this app data should be displayed ordered. And
the sort order is not identical in all countries.

Does the app have to send different SQL commands depending on the country?
Not nice.
Do the data have to be sorted in the app? Not nice either.

the query is the same - you just use a different COLLATE clause. For
Postgres there is not any other way.

One might use a function producing a SELECT taking the locale as a parameter.

Or views in schemas per locale. Selecting the search path
per locale pulls in the right view.

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

#6Dirk Mika
Dirk.Mika@mikatiming.de
In reply to: Karsten Hilbert (#5)
Re: Set COLLATE on a session level

pá 20. 11. 2020 v 15:28 odesílatel Dirk Mika <Dirk.Mika@mikatiming.de>
napsal:

Let's assume there is an app that accesses the same database from
different countries. And in this app data should be displayed ordered. And
the sort order is not identical in all countries.

Does the app have to send different SQL commands depending on the country?
Not nice.
Do the data have to be sorted in the app? Not nice either.

the query is the same - you just use a different COLLATE clause. For
Postgres there is not any other way.

One might use a function producing a SELECT taking the locale as a parameter.

Or views in schemas per locale. Selecting the search path
per locale pulls in the right view.

Suppose the SQL statements are generated by a database layer such as Hibernate. It seems to me that it is not possible to use a function that adds the COLLATE clause.

And one view per locale would mean that I would have to create a whole schema including all views for each locale I want to support. I would have to roll out a new version of the data model, just for an additional locale.

This all seems unnecessarily complicated to me.

Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

#7Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Dirk Mika (#6)
Re: Set COLLATE on a session level

Am Mon, Nov 30, 2020 at 10:11:38AM +0000 schrieb Dirk Mika:

Or views in schemas per locale. Selecting the search path
per locale pulls in the right view.

And one view per locale would mean that I would have to
create a whole schema including all views for each locale I
want to support. I would have to roll out a new version of
the data model, just for an additional locale.

Or a "smart" view. Set a session variable before running the
query and have the (one) view return the locale'd data based
on the session variable ...

set session "mika.current_locale" = 'locale@2_use';

and use

select current_setting('mika.current_locale')

as needed inside the view definition

This all seems unnecessarily complicated to me.

No one said it is going to be particularly convenient... You
asked for possible, I guess. :-)

Best,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

#8Dirk Mika
Dirk.Mika@mikatiming.de
In reply to: Karsten Hilbert (#7)
Re: Set COLLATE on a session level

Or views in schemas per locale. Selecting the search path
per locale pulls in the right view.

And one view per locale would mean that I would have to
create a whole schema including all views for each locale I
want to support. I would have to roll out a new version of
the data model, just for an additional locale.

Or a "smart" view. Set a session variable before running the
query and have the (one) view return the locale'd data based
on the session variable ...

set session "mika.current_locale" = 'locale@2_use';

and use

select current_setting('mika.current_locale')

as needed inside the view definition

I'll take a look at that. Thank you very much for the idea.

This all seems unnecessarily complicated to me.

No one said it is going to be particularly convenient... You
asked for possible, I guess. :-)

My sentence was more along the lines that I cannot imagine that I am the first to have this problem and that there should therefore be an easier solution. But apparently the topic is not so relevant in general. 😉

BR
Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Dirk Mika (#1)
Re: Set COLLATE on a session level

On 2020-11-20 08:13, Dirk Mika wrote:

we come from the Oracle world and we have an application that, depending on a setting, sends the command ALTER SESSION SET NLS_SORT=... when connecting to the database.

Is there a similar way to set a COLLATE for a session in PostgreSQL?

I know that I can specify a COLLATE for a SELECT statement in the ORDER BY Clause, but then I would have to adjust the statements in the client and statements that are automatically generated by the database components used, would not be affected.

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#9)
Re: Set COLLATE on a session level

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 2020-11-20 08:13, Dirk Mika wrote:

I know that I can specify a COLLATE for a SELECT statement in the ORDER BY Clause, but then I would have to adjust the statements in the client and statements that are automatically generated by the database components used, would not be affected.

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

[ squint... ] Just because it's in the standard doesn't mean it's a
good idea. It sounds like this is morally equivalent to a GUC that
changes query semantics. We have learned painfully that such behaviors
are best avoided, because they break things you didn't expect to break.

(I also wonder just exactly what such a setting would do, and how it
interacts with the existing rules for deriving collations within a
query. Does parse_collate.c go out the window the minute somebody
sets a non-empty collation setting?)

regards, tom lane

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#10)
Re: Set COLLATE on a session level

On 2020-12-04 17:18, Tom Lane wrote:

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

[ squint... ] Just because it's in the standard doesn't mean it's a
good idea. It sounds like this is morally equivalent to a GUC that
changes query semantics. We have learned painfully that such behaviors
are best avoided, because they break things you didn't expect to break.

I think it would be analogous to the schema search path.

#12Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Karsten Hilbert (#7)
Aw: Re: Set COLLATE on a session level

Or a "smart" view. Set a session variable before running the
query and have the (one) view return the locale'd data based
on the session variable ...

set session "mika.current_locale" = 'locale@2_use';

and use

select current_setting('mika.current_locale')

as needed inside the view definition

But the argument that follows COLLATE is an identifier, not a parameter,

ah, true enough

so it must be hardcoded in the view, just like column names or table names.
I don't see how you could use the setting to control the COLLATE clause
through a view.

The view might produce its rows by calling a function which
in turn reads the setting and dynamically constructs and exexcutes
the query needed to produce the locale-dependant rows, no ? =8-)

Convoluted ? I guess so ...

Karsten

#13Daniel Verite
daniel@manitou-mail.org
In reply to: Dirk Mika (#8)
Re: Set COLLATE on a session level

Dirk Mika wrote:

My sentence was more along the lines that I cannot imagine that I am the
first to have this problem and that there should therefore be an easier
solution. But apparently the topic is not so relevant in general. 😉

I'm not sure, but it's plausible that with Oracle, the real use case
for ALTER SESSION SET nls_sort=... is to avoid a binary sort.
There's a clear difference between binary sort and all linguistic
sorts, but few differences across linguistic sorts.
These differences tend to be subtle and ignorable by end users.
Also when storing texts from multiple languages in a database, a
session parameter picking a specific region/language will be right
for only a portion of your texts anyway.

With PostgreSQL, if the database is in UTF8 encoding with a default
collation like "en_US", or more generally language_COUNTRY,
the default sorts are already performed with the Unicode collation
algorithm (by the operating system libc) with reasonable generic
sort rules. Possibly most users would not even notice the
differences that would occur between collations if you used
per-language collations according to users preferences.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite

#14Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Daniel Verite (#13)
Aw: Re: Set COLLATE on a session level

There's [...] but few differences across linguistic sorts.
These differences tend to be subtle and ignorable by end users.

But _when_ they matter they matter a lot:

Lists of peoples' names in some not-quite expected order
are a major pain to skim over, for example.

OP is in the business of timekeeping the worklife of
people so I guess sorting might matter there.

Karsten

#15Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Peter Eisentraut (#11)
Re: Set COLLATE on a session level

On Sat, 2020-12-05 at 13:12 +0100, Peter Eisentraut wrote:

On 2020-12-04 17:18, Tom Lane wrote:

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

[ squint... ] Just because it's in the standard doesn't mean it's a
good idea. It sounds like this is morally equivalent to a GUC that
changes query semantics. We have learned painfully that such behaviors
are best avoided, because they break things you didn't expect to break.

I think it would be analogous to the schema search path.

Usually you notice right away if the "search_path" is wrong, because
relations won't be found.

But with a bad collation set in a session, the problems would be more
subtle. For example, if someone asks you why an index isn't used to
support sorting, you'd always have to remember to ask what collation
has been set in the session.

Yours,
Laurenz Albe

#16Dirk Mika
Dirk.Mika@mikatiming.de
In reply to: Laurenz Albe (#15)
Re: Set COLLATE on a session level

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

Am 06.12.20, 06:15 schrieb "Laurenz Albe" <laurenz.albe@cybertec.at>:

On Sat, 2020-12-05 at 13:12 +0100, Peter Eisentraut wrote:

On 2020-12-04 17:18, Tom Lane wrote:

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

[ squint... ] Just because it's in the standard doesn't mean it's a
good idea. It sounds like this is morally equivalent to a GUC that
changes query semantics. We have learned painfully that such behaviors
are best avoided, because they break things you didn't expect to break.

I think it would be analogous to the schema search path.

Usually you notice right away if the "search_path" is wrong, because
relations won't be found.

But with a bad collation set in a session, the problems would be more
subtle. For example, if someone asks you why an index isn't used to
support sorting, you'd always have to remember to ask what collation
has been set in the session.

This is true, but it is already the case in Oracle. There it is usually even the case that an index on a VARCHAR2 column is not used for sorting, since the index is binary sorted, but the language setting is usually not.

The SET COLLATION command would be exactly what we're looking for. (

BR
Dirk

#17Alexandre GRAIL
a.grail@augure-ng.fr
In reply to: Dirk Mika (#16)
Re: Set COLLATE on a session level

On 10/12/2020 19:33, Dirk Mika wrote:

There is a SET COLLATION command in the SQL standard that does this.
Someone just has to implement it. It wouldn't be terribly difficult, I
think.

I think it would be analogous to the schema search path.

Usually you notice right away if the "search_path" is wrong, because
relations won't be found.

But with a bad collation set in a session, the problems would be more
subtle. For example, if someone asks you why an index isn't used to
support sorting, you'd always have to remember to ask what collation
has been set in the session.

If that information appear in the explain output, you won’'t have to ask.

For people working in multilingual environment, a collation setting
within session absolutely makes sense. So you don’t have to specify the
same setting at each and every query. Also, the expected ordering do
depends on the user who do the query (and his expectation from the
language he’s working in) and *not* the data itself.

Regards

#18Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Dirk Mika (#6)
Re: Set COLLATE on a session level

On 2020-11-30 10:11:38 +0000, Dirk Mika wrote:

pá 20. 11. 2020 v 15:28 odesílatel Dirk Mika <Dirk.Mika@mikatiming.de>
napsal:

Let's assume there is an app that accesses the same database from
different countries. And in this app data should be displayed ordered. And
the sort order is not identical in all countries.

Does the app have to send different SQL commands depending on the country?
Not nice.
Do the data have to be sorted in the app? Not nice either.

the query is the same - you just use a different COLLATE clause. For
Postgres there is not any other way.

[...]

Suppose the SQL statements are generated by a database layer such as
Hibernate. It seems to me that it is not possible to use a function
that adds the COLLATE clause.

It seems to me that this is a defect in the ORM. Sorting by current
locale rules is important for many applications, so that is something an
ORM should support. How the ORM does it (setting a session parameter,
modifying the query, ...) may be backend-specific and not something the
programmer should worry about.

That said, I don't even know if Django (the only ORM I've used in any
depth) does that.

I also agree, that logically, the collation order should be a session
parameter. It is language-specific and therefore user-specific if you
have international users. (I acknowledge the potential performance
problems, but they are the same with an explicit collation clause).

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"