pg_dump error - LOCALIZATION PROBLEM

Started by Erol Öz over 24 years ago, 8 messages
#1Erol Öz
eroloz@esg.com.tr

Hi,
I think Tom Lane is right as always. My postgresql server was configured with the --enable-locale option and it works perfectly with Turkish stuff. However, I could not find a solution to the problem below.
Any hint?
Thanks and Regards
Erol

<eroloz@esg.com.tr> writes:

I get an error when the following command is executed:
/usr/local/pgsql/bin/pg_dump trollandtoad > trollandtoad.out

SET TRANSACTION command failed. Explanation from backend: 'ERROR: Bad TRANSACTION ISOLATION LEVEL (serializable)

Hmm. It would seem that strcasecmp() on your platform reports that the
strings "SERIALIZABLE" and "serializable" are not equal. A locale
problem perhaps?

regards, tom lane
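
To illustrate the hypothesis above, here is a minimal sketch of how a locale-dependent case fold can make "SERIALIZABLE" and "serializable" compare as different. So that it does not depend on a Turkish locale being installed, it simulates one: `turkish_tolower` is a hypothetical helper that mimics an ISO-8859-9 Turkish locale, where `I` lowercases to dotless `ı` (byte 0xFD). All names here are invented for illustration; this is not the libc or PostgreSQL code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for tolower() under a Turkish ISO-8859-9 locale:
 * 'I' lowercases to dotless i (byte 0xFD) instead of ASCII 'i'. */
static unsigned char turkish_tolower(unsigned char c)
{
    if (c == 'I')
        return 0xFD;
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c - 'A' + 'a');
    return c;
}

/* Plain ASCII lowercasing, as in the "C" locale. */
static unsigned char ascii_tolower(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c - 'A' + 'a');
    return c;
}

typedef unsigned char (*fold_fn) (unsigned char);

/* Case-insensitive equality that lowercases both sides with the given rule.
 * Returns 1 if the strings match, 0 otherwise. */
static int equal_ignoring_case(const char *a, const char *b, fold_fn fold)
{
    assert(a != NULL && b != NULL && fold != NULL);
    while (*a != '\0' && *b != '\0')
    {
        if (fold((unsigned char) *a) != fold((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;            /* equal only if both ended together */
}
```

With `ascii_tolower` the two spellings match; with `turkish_tolower` they do not, because the `I` in "SERIALIZABLE" folds to 0xFD while the `i` in "serializable" stays `i`. The same applies to "COMMITTED".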

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Erol Öz (#1)
Re: pg_dump error - LOCALIZATION PROBLEM

Erol Öz writes:

I think Tom Lane is right as always. My postgresql server was
configured with the --enable-locale option and it works perfectly with
Turkish stuff. However I could not find a solution to the problem
below.

Untested, but try this:

Edit src/backend/commands/variable.c, look for the function
parse_XactIsoLevel(). Change the code that looks like this:

    if (strcasecmp(value, "SERIALIZABLE") == 0)
        XactIsoLevel = XACT_SERIALIZABLE;
    else if (strcasecmp(value, "COMMITTED") == 0)
        XactIsoLevel = XACT_READ_COMMITTED;

into:

    if (strcmp(value, "serializable") == 0)
        XactIsoLevel = XACT_SERIALIZABLE;
    else if (strcmp(value, "committed") == 0)
        XactIsoLevel = XACT_READ_COMMITTED;

Recompile and install.

<eroloz@esg.com.tr> writes:

I get an error when the following command is executed:
/usr/local/pgsql/bin/pg_dump trollandtoad > trollandtoad.out

SET TRANSACTION command failed. Explanation from backend: 'ERROR: Bad TRANSACTION ISOLATION LEVEL (serializable)

Hmm. It would seem that strcasecmp() on your platform reports that the
strings "SERIALIZABLE" and "serializable" are not equal. A locale
problem perhaps?

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#2)
Re: pg_dump error - LOCALIZATION PROBLEM

Peter Eisentraut <peter_e@gmx.net> writes:

Untested, but try this:

Edit src/backend/commands/variable.c, look for the function
parse_XactIsoLevel(). Change the code that looks like this:

if (strcasecmp(value, "SERIALIZABLE") == 0)
XactIsoLevel = XACT_SERIALIZABLE;
else if (strcasecmp(value, "COMMITTED") == 0)
XactIsoLevel = XACT_READ_COMMITTED;

into:

if (strcmp(value, "serializable") == 0)
XactIsoLevel = XACT_SERIALIZABLE;
else if (strcmp(value, "committed") == 0)
XactIsoLevel = XACT_READ_COMMITTED;

Hmm. Given that we expect the lexer to have downcased any unquoted
words, this seems like a workable solution --- where else are we using
strcasecmp() unnecessarily?

regards, tom lane

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#3)
Re: pg_dump error - LOCALIZATION PROBLEM

Tom Lane writes:

Hmm. Given that we expect the lexer to have downcased any unquoted
words, this seems like a workable solution --- where else are we using
strcasecmp() unnecessarily?

I've identified several other such places. However, in reality we have to
consider every single strcasecmp() call suspicious. In many places an
ASCII-only alternative is needed or the code needs to be rewritten.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: pg_dump error - LOCALIZATION PROBLEM

Hmm. Given that we expect the lexer to have downcased any unquoted
words, this seems like a workable solution --- where else are we using
strcasecmp() unnecessarily?

Wait a minute --- I spoke too quickly. The lexer's behavior is to
downcase unquoted identifiers in a *locale sensitive* fashion --- it
uses isupper() and tolower(). We concluded that that was correct for
identifiers according to SQL99, whereas keyword matching should not be
locale-dependent. See the comments for ScanKeywordLookup.

I've identified several other such places. However, in reality we have to
consider every single strcasecmp() call suspicious. In many places an
ASCII-only alternative is needed or the code needs to be rewritten.

I think our problems are worse than that: once the identifier has been
through a locale-dependent case conversion we really have a problem
matching it to an ASCII string. The only real solution may be to
require *all* keywords to be matched in the lexer, and forbid strcmp()
matching in later phases entirely.
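
As a rough sketch of the distinction drawn above: keywords are matched with an ASCII-only fold that never consults the locale, while anything that is not a keyword is downcased with isupper()/tolower(). The table contents and the names `sketch_keywords`, `sketch_keyword_lookup`, and `sketch_downcase_identifier` are assumptions made for illustration; this is not the actual ScanKeywordLookup or scan.l code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Tiny illustrative keyword table; a real one would be much larger. */
static const char *const sketch_keywords[] = {
    "committed", "isolation", "level", "read", "serializable", "transaction"
};

/* Returns the canonical keyword if word matches one under ASCII-only
 * case folding, or NULL if it is not a keyword. The locale is never used. */
static const char *sketch_keyword_lookup(const char *word)
{
    char folded[32];
    size_t len, i;

    assert(word != NULL);
    len = strlen(word);
    if (len >= sizeof(folded))
        return NULL;            /* longer than any keyword in the table */
    for (i = 0; i < len; i++)
    {
        char c = word[i];
        folded[i] = (c >= 'A' && c <= 'Z') ? (char) (c - 'A' + 'a') : c;
    }
    folded[len] = '\0';
    for (i = 0; i < sizeof(sketch_keywords) / sizeof(sketch_keywords[0]); i++)
        if (strcmp(folded, sketch_keywords[i]) == 0)
            return sketch_keywords[i];
    return NULL;
}

/* Locale-sensitive downcasing in place, as applied to non-keyword identifiers. */
static void sketch_downcase_identifier(char *ident)
{
    assert(ident != NULL);
    for (; *ident != '\0'; ident++)
        if (isupper((unsigned char) *ident))
            *ident = (char) tolower((unsigned char) *ident);
}
```

A lexer built this way would call `sketch_keyword_lookup` first and fall back to `sketch_downcase_identifier` only when it returns NULL, so a word such as SERIALIZABLE never passes through the locale-dependent path as long as it is in the table.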

regards, tom lane

#6Burak Bilen
bilen@metu.edu.tr
In reply to: Tom Lane (#3)
Re: pg_dump error - LOCALIZATION PROBLEM

hi,

I have seen the same problem. But there is another problem related to
locale: the function MIN is translated into mın (with Turkish locale
support) and postgres gives the following error message:
Function 'mın(int8)' does not exist.

But when I use "LIKE", postgres does the operations correctly. I don't
know the internals of postgres, but I would like to solve this problem
somehow.
Thanks in advance.

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Untested, but try this:

Edit src/backend/commands/variable.c, look for the function
parse_XactIsoLevel(). Change the code that looks like this:

if (strcasecmp(value, "SERIALIZABLE") == 0)
XactIsoLevel = XACT_SERIALIZABLE;
else if (strcasecmp(value, "COMMITTED") == 0)
XactIsoLevel = XACT_READ_COMMITTED;

into:

if (strcmp(value, "serializable") == 0)
XactIsoLevel = XACT_SERIALIZABLE;
else if (strcmp(value, "committed") == 0)
XactIsoLevel = XACT_READ_COMMITTED;

Hmm. Given that we expect the lexer to have downcased any unquoted
words, this seems like a workable solution --- where else are we using
strcasecmp() unnecessarily?

regards, tom lane


#7Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#5)
Re: [GENERAL] pg_dump error - LOCALIZATION PROBLEM

Tom Lane writes:

I think our problems are worse than that: once the identifier has been
through a locale-dependent case conversion we really have a problem
matching it to an ASCII string. The only real solution may be to
require *all* keywords to be matched in the lexer, and forbid strcmp()
matching in later phases entirely.

There are several classes of strcasecmp() misuse:

1. Using strcasecmp() on strings that are guaranteed to be lower case,
because the parser has assigned to the variable one of a finite set of
literal strings. See CREATE SEQUENCE, commands/sequence.c for example.

2. Using strcasecmp() on strings that were parsed as keywords. See CREATE
OPERATOR, CREATE AGGREGATE, CREATE TYPE, commands/define.c.

3. Using strcasecmp() on the values of GUC variables.

4. Using strcasecmp() for parsing configuration files or other things with
separate syntax rules. See libpq/hba.c for reading the recode table.

For #1, strcasecmp is just a waste.

For #2, we should export parts of ScanKeywordLookup as a generic function,
perhaps "normalize_identifier", and then we can replace

strcasecmp(var, "expected_value")

with

strcmp(normalize_identifier(var), "expected_value")

For #3, it's not quite clear, because the string value could have been
created by an identifier or a string constant, so it's either #2 or #4.

For #4, we need some ASCII-only strcasecmp version.
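
A minimal sketch of what such an ASCII-only comparison could look like. The names `ascii_lower` and `ascii_strcasecmp` are hypothetical, chosen for illustration; the function follows the strcmp() return convention and folds only the bytes A-Z, so its result does not change with setlocale().

```c
#include <assert.h>
#include <stddef.h>

/* Lowercase only the ASCII letters A-Z; every other byte is left untouched. */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char) (c - 'A' + 'a') : c;
}

/* Hypothetical ASCII-only replacement for strcasecmp().
 * Returns <0, 0, or >0, like strcmp(). */
static int ascii_strcasecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;)
    {
        unsigned char ca = ascii_lower((unsigned char) *a++);
        unsigned char cb = ascii_lower((unsigned char) *b++);

        if (ca != cb)
            return (int) ca - (int) cb;
        if (ca == '\0')
            return 0;           /* both strings ended at the same point */
    }
}
```

This only helps for strings that have not already been through a locale-dependent case conversion; once `I` has been turned into a non-ASCII byte, no ASCII-only comparison can match it against "i".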

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#7)
Re: [GENERAL] pg_dump error - LOCALIZATION PROBLEM

Peter Eisentraut <peter_e@gmx.net> writes:

2. Using strcasecmp() on strings that were parsed as keywords. See CREATE
OPERATOR, CREATE AGGREGATE, CREATE TYPE, commands/define.c.

But the real point is that they were parsed as identifiers, *not*
keywords, and therefore have already been through a locale-dependent
case conversion. (Look at what happens in scan.l after
ScanKeywordLookup fails.) Unless we can undo or short-circuit that,
it won't help to apply a correct ASCII-only comparison.

Possibly we should change the parser's Ident node type to carry both the
raw string and the downcased-as-identifier string. The latter would
serve the existing needs, the former could be used for keyword matching.
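
One way to picture this suggestion, as a sketch only: a node that keeps both spellings side by side. `SketchIdent` and its helper functions are hypothetical names and do not reflect the actual parser node layout; the folded copy uses the same isupper()/tolower() rule described for identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical identifier node; not the actual PostgreSQL Ident struct. */
typedef struct SketchIdent
{
    char *raw;      /* text exactly as the user typed it */
    char *folded;   /* downcased with the locale-sensitive identifier rule */
} SketchIdent;

/* Copy a string; returns NULL on allocation failure. */
static char *sketch_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Build a node holding both spellings. Returns 0 on success, -1 on failure. */
static int sketch_ident_init(SketchIdent *id, const char *text)
{
    char *p;

    assert(id != NULL && text != NULL);
    id->raw = sketch_strdup(text);
    id->folded = sketch_strdup(text);
    if (id->raw == NULL || id->folded == NULL)
    {
        free(id->raw);
        free(id->folded);
        id->raw = id->folded = NULL;
        return -1;
    }
    for (p = id->folded; *p != '\0'; p++)
        if (isupper((unsigned char) *p))
            *p = (char) tolower((unsigned char) *p);
    return 0;
}

/* Release both copies and reset the node. */
static void sketch_ident_free(SketchIdent *id)
{
    assert(id != NULL);
    free(id->raw);
    free(id->folded);
    id->raw = id->folded = NULL;
}
```

Under this layout, name lookup would keep using `folded`, while later keyword matching could compare `raw` with an ASCII-only, case-insensitive comparison.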

For #2, we should export parts of ScanKeywordLookup as a generic function,
perhaps "normalize_identifier", ...
For #4, we need some ASCII-only strcasecmp version.

I think these are the same thing.

regards, tom lane