msvc++ build of 8.2.4 and encodings

Started by Charlie Savage over 18 years ago, 5 messages
#1 Charlie Savage
cfis@savagexi.com

Hope this is the right place for this post...

I've been trying out the MSVC++ build scripts for PostgreSQL 8.2.4 on my
development laptop (running Windows XP Pro).

I noticed the sort orders of queries changed. Investigating more,
encodings don't seem to be working as expected.

Using a MSVC++ build:

CREATE DATABASE test1 WITH ENCODING = 'utf8';

show all

"lc_collate";"English_United States.1252"
"lc_ctype";"English_United States.1252"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Using a MSYS build:

CREATE DATABASE test1 WITH ENCODING = 'utf8';

show all

"lc_collate";"en_US.UTF-8"
"lc_ctype";"en_US.UTF-8"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

In both cases, the database clusters were created like this:

initdb ---locale=c --encoding=utf8;

Note that I successfully built all the various encoding projects for the
MSVC++ build and have installed them.

I'd be happy to debug this a bit more if that would be helpful.

Thanks,

Charlie

#2 Charlie Savage
cfis@savagexi.com
In reply to: Charlie Savage (#1)
Re: msvc++ build of 8.2.4 and encodings

Using a MSYS build:

CREATE DATABASE test1 WITH ENCODING = 'utf8';

show all

"lc_collate";"en_US.UTF-8"
"lc_ctype";"en_US.UTF-8"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Sorry, the above output is for Linux (Fedora Core 6). With an MSYS
build on my XP laptop it's:

"lc_collate";"C"
"lc_ctype";"C"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Still different than the MSVC++ build.
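
A minimal sketch of how two such "show all" exports could be compared mechanically rather than by eye. It assumes the output is saved in the semicolon-separated, double-quoted form shown above; the function names parse_settings and diff_settings are hypothetical and not part of PostgreSQL or any client tool.

```python
import csv
import io


def parse_settings(text):
    """Parse lines such as "lc_collate";"C" into a {name: value} dict."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar='"')
    # Blank lines come back as empty rows, so skip anything without two fields.
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Values copied from the MSVC++ and MSYS outputs in this thread.
msvc = parse_settings('''"lc_collate";"English_United States.1252"
"lc_ctype";"English_United States.1252"
"lc_messages";"C"
''')
msys = parse_settings('''"lc_collate";"C"
"lc_ctype";"C"
"lc_messages";"C"
''')

for name, (msvc_value, msys_value) in diff_settings(msvc, msys).items():
    print(f"{name}: MSVC++={msvc_value!r} MSYS={msys_value!r}")
```

Only lc_collate and lc_ctype are reported for these two inputs, which matches the difference described above.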

Thanks,

Charlie

#3 Andrew Dunstan
andrew@dunslane.net
In reply to: Charlie Savage (#1)
Re: msvc++ build of 8.2.4 and encodings

Charlie Savage wrote:

Hope this is the right place for this post...

I've been trying out the MSVC++ build scripts for PostgreSQL 8.2.4 on
my development laptop (running Windows XP Pro).

I noticed the sort orders of queries changed. Investigating more,
encodings don't seem to be working as expected.

Using a MSVC++ build:

CREATE DATABASE test1 WITH ENCODING = 'utf8';

show all

"lc_collate";"English_United States.1252"
"lc_ctype";"English_United States.1252"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Using a MSYS build:

CREATE DATABASE test1 WITH ENCODING = 'utf8';

show all

"lc_collate";"en_US.UTF-8"
"lc_ctype";"en_US.UTF-8"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

In both cases, the database clusters were created like this:

initdb ---locale=c --encoding=utf8;

That seems most unlikely - without the superfluous dash it should set
both lc_collate and lc_ctype to C.

Please try the following in both cases:

initdb --no-locale --encoding=utf8 data
pg_controldata data | grep LC_

If it doesn't show this:

LC_COLLATE: C
LC_CTYPE: C

then that's a bug. Or if after that you connect to the instance and
"show lc_collate" or "show lc_ctype" don't likewise show C then that's a
bug.
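
For anyone without grep on Windows, the same check could be scripted. This is only a sketch: it assumes pg_controldata is on the PATH and prints "NAME: value" lines as in the expected output above, and the helper names (read_controldata, locale_settings, is_c_locale) are hypothetical.

```python
import subprocess


def read_controldata(data_dir):
    """Run pg_controldata on a data directory and return its text output."""
    result = subprocess.run(
        ["pg_controldata", data_dir],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def locale_settings(controldata_output):
    """Extract the LC_* entries from pg_controldata output text."""
    settings = {}
    for line in controldata_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().startswith("LC_"):
            settings[name.strip()] = value.strip()
    return settings


def is_c_locale(settings):
    """True only if both LC_COLLATE and LC_CTYPE are exactly 'C'."""
    return settings.get("LC_COLLATE") == "C" and settings.get("LC_CTYPE") == "C"


# Offline example using the expected output quoted above.
sample = "LC_COLLATE: C\nLC_CTYPE: C\n"
print(is_c_locale(locale_settings(sample)))
```

Against a real cluster, the call would be is_c_locale(locale_settings(read_controldata("data"))), where "data" is the directory passed to initdb.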

Are you by any chance loading a library that calls setlocale() ?

cheers

andrew

#4 Charlie Savage
cfis@savagexi.com
In reply to: Andrew Dunstan (#3)
Re: msvc++ build of 8.2.4 and encodings

Hi Andrew,

Thanks for the reply.

In both cases, the database clusters were created like this:

initdb ---locale=c --encoding=utf8;

That seems most unlikely - without the superfluous dash it should set
both lc_collate and lc_ctype to C.

Ah, sorry, that was a typo. If you actually try it:

C:\WINDOWS\system32>initdb ---locale=C --encoding=utf8 c:\data_msvcc3
initdb: illegal option -- -locale=C

Please try the following in both cases:

initdb --no-locale --encoding=utf8 data
pg_controldata data | grep LC_

If it doesn't show this:

LC_COLLATE: C
LC_CTYPE: C

then that's a bug.

With MSYS build:

initdb --no-locale --encoding=utf8 c:\data_msys

C:\WINDOWS\system32>pg_controldata c:\data_msys | grep LC_
LC_COLLATE: C
LC_CTYPE: C

[connect to postgres database]
show lc_collate C
show lc_ctype C

create database test with encoding='utf8'

[switch to postgres database]
show lc_collate C
show lc_ctype C

With VC++ build:

initdb --no-locale --encoding=utf8 c:\data_msvcc

C:\WINDOWS\system32>pg_controldata c:\data_msvcc | grep LC_
LC_COLLATE: C
LC_CTYPE: C

show lc_collate C
show lc_ctype C

create database test with encoding='utf8'

[switch to postgres database]
show lc_collate C
show lc_ctype C

Ok, so this works.

And if I use --locale=C for initdb it gives the same answers.

Are you by any chance loading a library that calls setlocale() ?

Hmm. It's PostgreSQL 8.2.4 + tsearch2 + tree + postgis. postgis in
turn loads proj4 and geos. I grepped through the source code of those
3 libraries and did not find any calls to setlocale. So I don't think so.

So now I'm confused - if I go back to my other cluster that I originally
wrote about (created with the MSVC++ build also) and create a database
it has a different lc_collate ("English_United States.1252"). Could this
be from the dump/reload?

Charlie

#5 Magnus Hagander
magnus@hagander.net
In reply to: Charlie Savage (#4)
Re: msvc++ build of 8.2.4 and encodings

On Wed, Aug 29, 2007 at 09:49:03PM -0600, Charlie Savage wrote:

Hmm. It's PostgreSQL 8.2.4 + tsearch2 + tree + postgis. postgis in
turn loads proj4 and geos. I grepped through the source code of those
3 libraries and did not find any calls to setlocale. So I don't think so.

So now I'm confused - if I go back to my other cluster that I originally
wrote about (created with the MSVC++ build also) and create a database
it has a different lc_collate ("English_United States.1252"). Could this
be from the dump/reload?

Shouldn't be - it's set at initdb and not at reload. My guess would be that
you somehow missed the locale parameter on that initdb call - I don't
suppose you still have it in your command-line history? :-)

There should be zero difference in what initdb does, and I've never seen
anything like that other than when I missed some option to it.

//Magnus