(patch) regression diffs on collate.linux.utf8 test

Started by Jeff Davis, about 14 years ago, 13 messages
#1 Jeff Davis
pgsql@j-davis.com
1 attachment(s)

On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Regards,
Jeff Davis

Attachments:

collate.linux.utf8.patch (text/x-patch; charset=UTF-8)
*** a/src/test/regress/expected/collate.linux.utf8.out
--- b/src/test/regress/expected/collate.linux.utf8.out
***************
*** 395,401 **** SELECT relname FROM pg_class WHERE relname ~* '^abc';
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
--- 395,401 ----
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
***************
*** 967,972 **** CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
--- 967,973 ----
  ERROR:  parameter "lc_ctype" must be specified
  CREATE COLLATION testx (locale = 'nonsense'); -- fail
  ERROR:  could not create locale "nonsense": No such file or directory
+ DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
  CREATE COLLATION test4 FROM nonsense;
  ERROR:  collation "nonsense" for encoding "UTF8" does not exist
  CREATE COLLATION test5 FROM test0;
*** a/src/test/regress/sql/collate.linux.utf8.sql
--- b/src/test/regress/sql/collate.linux.utf8.sql
***************
*** 146,152 **** SELECT relname FROM pg_class WHERE relname ~* '^abc';
  
  -- to_char
  
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
  
--- 146,152 ----
  
  -- to_char
  
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
  
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#1)
Re: (patch) regression diffs on collate.linux.utf8 test

Jeff Davis <pgsql@j-davis.com> writes:

On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
line. However, I don't see the need for fooling with the lc_time value?

regards, tom lane

#3 Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: (patch) regression diffs on collate.linux.utf8 test

On Sun, 2011-10-16 at 16:00 -0400, Tom Lane wrote:

Jeff Davis <pgsql@j-davis.com> writes:

On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
line. However, I don't see the need for fooling with the lc_time value?

regards, tom lane

Here is the diff that I'm seeing on master right now with:

make -s check EXTRA_TESTS=collate.linux.utf8

If I qualify it as "tr_TR.UTF-8" it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
language-pack-de
language-pack-tr
language-pack-sv

in an attempt to make the test work, and everything works except for that
lc_time setting.

Regards,
Jeff Davis

Attachments:

regression.diffs (text/x-patch; charset=UTF-8)
*** /home/jdavis/wd/git/postgresql/src/test/regress/expected/collate.linux.utf8.out	2011-10-18 00:47:06.817223853 -0700
--- /home/jdavis/wd/git/postgresql/src/test/regress/results/collate.linux.utf8.out	2011-10-18 01:02:06.509206748 -0700
***************
*** 396,411 ****
  
  -- to_char
  SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
!  01 NIS 2010
  (1 row)
  
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
     to_char   
  -------------
!  01 NİS 2010
  (1 row)
  
  -- backwards parsing
--- 396,412 ----
  
  -- to_char
  SET lc_time TO 'tr_TR';
+ ERROR:  invalid value for parameter "lc_time": "tr_TR"
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
!  01 APR 2010
  (1 row)
  
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
     to_char   
  -------------
!  01 APR 2010
  (1 row)
  
  -- backwards parsing

======================================================================

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#3)
Re: (patch) regression diffs on collate.linux.utf8 test

Jeff Davis <pgsql@j-davis.com> writes:

On Sun, 2011-10-16 at 16:00 -0400, Tom Lane wrote:

Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
line. However, I don't see the need for fooling with the lc_time value?

Here is the diff that I'm seeing on master right now with:
make -s check EXTRA_TESTS=collate.linux.utf8

If I qualify it as "tr_TR.UTF-8" it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
language-pack-de
language-pack-tr
language-pack-sv

That's very strange. It works as-is for me on Fedora 13, 14, and 15,
which you'd expect to have essentially the same I18N infrastructure as
Ubuntu (and I'm pretty sure I didn't install any optional language
support for Turkish). Anybody know what the problem is?

The reason I'm resisting just changing it is that I'd prefer to minimize
the number of dependencies this regression test has on the exact
spelling of "UTF-8", as that is not terribly well standardized. On my
Fedora boxes, for instance, locale -a says that "tr_TR.utf8" is the
name of that particular locale.

regards, tom lane

#5 Peter Eisentraut
peter_e@gmx.net
In reply to: Jeff Davis (#3)
Re: (patch) regression diffs on collate.linux.utf8 test

On tis, 2011-10-18 at 01:07 -0700, Jeff Davis wrote:

On Sun, 2011-10-16 at 16:00 -0400, Tom Lane wrote:

Jeff Davis <pgsql@j-davis.com> writes:

On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
line. However, I don't see the need for fooling with the lc_time value?

regards, tom lane

Here is the diff that I'm seeing on master right now with:

make -s check EXTRA_TESTS=collate.linux.utf8

If I qualify it as "tr_TR.UTF-8" it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
language-pack-de
language-pack-tr
language-pack-sv

in an attempt to make the test work, and everything works except for that
lc_time setting.

I think the language-pack packages have nothing to do with it; they only
supply translations.

Possibly, things are set up so that only UTF-8 locales are installed by
default. Since the collate.linux.utf8 test requires a UTF-8 environment, it
seems reasonable to use the tr_TR.UTF-8 locale for LC_TIME, instead of
requiring an unrelated (ISO-8859-9) locale to be installed. So I think
the change you propose is reasonable.

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: (patch) regression diffs on collate.linux.utf8 test

Peter Eisentraut <peter_e@gmx.net> writes:

On tis, 2011-10-18 at 01:07 -0700, Jeff Davis wrote:

If I qualify it as "tr_TR.UTF-8" it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
language-pack-de
language-pack-tr
language-pack-sv
in an attempt to make the test work, and everything works except for that
lc_time setting.

I think the language-pack packages have nothing to do with it; they only
supply translations.

Possibly, things are set up so that only UTF-8 locales are installed by
default. Since the collate.linux.utf8 test requires a UTF-8 environment, it
seems reasonable to use the tr_TR.UTF-8 locale for LC_TIME, instead of
requiring an unrelated (ISO-8859-9) locale to be installed. So I think
the change you propose is reasonable.

As I said to Jeff earlier, I'd rather not embed assumptions about the
spelling of encoding names into this test. So I don't want to do this
just to get rid of an unexplained failure. I don't entirely believe
the above theory, because it's not clear why Jeff's machine is behaving
differently from mine.

regards, tom lane

#7 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: (patch) regression diffs on collate.linux.utf8 test

On tis, 2011-10-18 at 15:21 -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On tis, 2011-10-18 at 01:07 -0700, Jeff Davis wrote:

If I qualify it as "tr_TR.UTF-8" it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
language-pack-de
language-pack-tr
language-pack-sv
in an attempt to make the test work, and everything works except for that
lc_time setting.

I think the language-pack packages have nothing to do with it; they only
supply translations.

Possibly, things are set up so that only UTF-8 locales are installed by
default. Since the collate.linux.utf8 test requires a UTF-8 environment, it
seems reasonable to use the tr_TR.UTF-8 locale for LC_TIME, instead of
requiring an unrelated (ISO-8859-9) locale to be installed. So I think
the change you propose is reasonable.

As I said to Jeff earlier, I'd rather not embed assumptions about the
spelling of encoding names into this test. So I don't want to do this
just to get rid of an unexplained failure. I don't entirely believe
the above theory, because it's not clear why Jeff's machine is behaving
differently from mine.

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

glibc has always accepted variant locale name spellings such as "UTF-8"
vs "utf8", so it's not a problem.
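
To illustrate the variant-spelling point, here is a rough sketch of the
kind of codeset normalization involved. It assumes the rule is "lowercase
the codeset and drop anything that is not a letter or digit", which would
explain why a request for "tr_TR.UTF-8" matches the "tr_TR.utf8" entry
that locale -a prints. The function name normalize_locale_name is
hypothetical, and this is an approximation, not glibc's actual code.

```shell
#!/bin/sh
# Hypothetical helper: approximate how a locale name's codeset part is
# normalized before lookup (assumed rule: lowercase, keep only letters
# and digits). A name without a codeset is returned unchanged.
normalize_locale_name() {
    nl_name=$1
    nl_modifier=

    # Split off an optional "@modifier" suffix so it is left untouched.
    case $nl_name in
        *@*)
            nl_modifier=@${nl_name#*@}
            nl_name=${nl_name%%@*}
            ;;
    esac

    case $nl_name in
        *.*)
            nl_base=${nl_name%%.*}      # e.g. "tr_TR"
            nl_codeset=${nl_name#*.}    # e.g. "UTF-8"
            nl_codeset=$(printf '%s' "$nl_codeset" \
                | tr -d -c '[:alnum:]' \
                | tr '[:upper:]' '[:lower:]')
            printf '%s.%s%s\n' "$nl_base" "$nl_codeset" "$nl_modifier"
            ;;
        *)
            printf '%s%s\n' "$nl_name" "$nl_modifier"
            ;;
    esac
}
```

Under that assumption, normalize_locale_name tr_TR.UTF-8 and
normalize_locale_name tr_TR.utf8 both give the same string, while plain
tr_TR stays a distinct name, which is the one the test asks for.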

#8 Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#7)
Re: (patch) regression diffs on collate.linux.utf8 test

On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version. I didn't realize they were
different -- do you happen to know what package I need for just plain
tr_TR?

Regards,
Jeff Davis

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#8)
Re: (patch) regression diffs on collate.linux.utf8 test

Jeff Davis <pgsql@j-davis.com> writes:

On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version.

Wow, that's interesting. Digging around on my Fedora box, I can't find
any suggestion that it's even possible to subdivide the locale settings
like that. I only see one source file for tr_TR --- that's
/usr/share/i18n/locales/tr_TR --- and it looks like all the stuff under
/usr/share/i18n/locales/ is compiled into one big run-time file
/usr/lib/locale/locale-archive.

regards, tom lane

#10 Peter Eisentraut
peter_e@gmx.net
In reply to: Jeff Davis (#8)
Re: (patch) regression diffs on collate.linux.utf8 test

On tis, 2011-10-18 at 21:47 -0700, Jeff Davis wrote:

On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version. I didn't realize they were
different -- do you happen to know what package I need for just plain
tr_TR?

dpkg-reconfigure locales

or

apt-get install locales-all

#11 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#9)
Re: (patch) regression diffs on collate.linux.utf8 test

On ons, 2011-10-19 at 01:10 -0400, Tom Lane wrote:

Jeff Davis <pgsql@j-davis.com> writes:

On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version.

Wow, that's interesting. Digging around on my Fedora box, I can't find
any suggestion that it's even possible to subdivide the locale settings
like that. I only see one source file for tr_TR --- that's
/usr/share/i18n/locales/tr_TR --- and it looks like all the stuff under
/usr/share/i18n/locales/ is compiled into one big run-time file
/usr/lib/locale/locale-archive.

It has "always" been the case on Debian that it doesn't blindly install
all 600+ locales provided by glibc. Instead, the OS installer picks the
ones that you are likely to use, and generates those from source at the
time the "locales" package is installed. (So the locales package
contains the source for all glibc locales, but not the binary form.)

In fact, here is the output from a vanilla Debian stable installation:

$ locale -a
C
en_US.utf8
POSIX

I suspect (this is Ubuntu-specific, so I don't have direct experience
with it) that installing the langpack packages that Jeff mentioned
triggers the compilation of the respective associated locales. But as
you can see, apparently only utf8 locales are generated by default
nowadays.

#12 Jeff Davis
pgsql@j-davis.com
In reply to: Peter Eisentraut (#10)
Re: (patch) regression diffs on collate.linux.utf8 test

On Wed, 2011-10-19 at 11:44 +0300, Peter Eisentraut wrote:

On tis, 2011-10-18 at 21:47 -0700, Jeff Davis wrote:

On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:

Presumably because Jeff doesn't have that particular locale installed.
locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version. I didn't realize they were
different -- do you happen to know what package I need for just plain
tr_TR?

dpkg-reconfigure locales

Did that, and still:

# locale -a|grep -i tr
tr_CY.utf8
tr_TR.utf8

apt-get install locales-all

# aptitude install locales-all
No candidate version found for locales-all
No candidate version found for locales-all
No packages will be installed, upgraded, or removed.
0 packages upgraded, 0 newly installed, 0 to remove and 80 not upgraded.
Need to get 0 B of archives. After unpacking 0 B will be used.

Regards,
Jeff Davis

#13 Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#12)
Re: (patch) regression diffs on collate.linux.utf8 test

On Wed, 2011-10-19 at 10:10 -0700, Jeff Davis wrote:

dpkg-reconfigure locales

I had to manually do

# locale-gen tr_TR

to make it generate tr_TR.ISO-8859-9, and now it works.

I'm not sure what we should do, exactly, but I expect that others who
attempt to run the test on Ubuntu (and maybe Debian) might get confused.
I'd be fine with leaving it alone though, too.

Regards,
Jeff Davis
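
One way to spare others the confusion described above would be a small
pre-flight check before running the test. The sketch below is only an
illustration: missing_locales is a hypothetical helper, not part of the
PostgreSQL tree, and it assumes that a generated non-UTF-8 Turkish locale
appears in the locale -a listing under the plain name tr_TR. It reads a
locale listing on standard input and prints each requested name that is
absent.

```shell
#!/bin/sh
# Hypothetical helper: read a "locale -a" style listing on stdin and
# print every locale name given as an argument that is not in the list.
missing_locales() {
    ml_installed=$(cat)
    for ml_wanted in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: no output
        if ! printf '%s\n' "$ml_installed" | grep -F -x -q -e "$ml_wanted"; then
            printf '%s\n' "$ml_wanted"
        fi
    done
}

# Intended use on a real system (not run here):
#   locale -a | missing_locales tr_TR
# Any name printed would need to be generated first, for example with
# the "locale-gen tr_TR" step mentioned in message #13.
```

Fed the two-line listing from message #8 (tr_CY.utf8 and tr_TR.utf8),
a check for tr_TR reports it as missing, which matches the failure seen
with SET lc_time TO 'tr_TR'.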