pgsql2shp : Encoding headache
Hi all!
I have an UTF8 encoded shapefile, and an UTF8 encoded postgis-enabled
database. I want my shapefile to be encoded in WIN1252, and a particular
field to be in uppercase.
Since I am on windows, I don't have an iconv executable. Therefore, I am
trying to :
- dump the shapefile with shp2pgsql to an sql text file
- load the resulting sql file into a postgresql table
- dump this table into a shapefile with pgsql2shp (in WIN1252)
To load the shapefile into postgresql, I had to dump it without
specifying a '-W' argument, set my client_encoding to UTF8, then load
the sql file into postgresql.
If I look at the data with pgAdmin (with client_encoding set to UTF8),
it looks good : accentuation is fine, special characters are fine.
To dump the data in a WIN1252-encoded shapefile, pgsql2shp has no
encoding argument, so I set my client encoding to WIN1252 thtough the
environment variable PGCLIENTENCODING.
If I just dump the file this way, it seems to be fine. So this command
works fine :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT
mytext, mygeom FROM mytemptable"
-> [621679 rows]
But then, if I dump it through a query to have my field in uppercase, I
get an error 'character 0xc29f of encoding UTF8 has no equivalent in
WIN1252' (translated by myself, the message is in French)
The command is simply :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT
upper(mytext) as mytext, mygeom FROM mytemptable"
So I guess there is a problem with my 'upper' conversion, but I have no
idea what this 0xc29f character could be.
Any help would be greatly appreciated.
Thanks a lot !
--
Arnaud Lesauvage
On 16/10/2009 10:36, Arnaud Lesauvage wrote:
I have an UTF8 encoded shapefile, and an UTF8 encoded postgis-enabled
database. I want my shapefile to be encoded in WIN1252, and a particular
field to be in uppercase.Since I am on windows, I don't have an iconv executable. Therefore, I am
trying to :
If it's any help to you, you can get iconv (and a bunch of other helpful
stuff) from GnuWin32:
http://gnuwin32.sourceforge.net/
Ray.
------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------
Does that last query (invoking the upper() function) actually run well when
executed in pgsql console?
Rob
2009/10/16 Arnaud Lesauvage <arnaud.listes@codata.eu>
Show quoted text
Hi all!
I have an UTF8 encoded shapefile, and an UTF8 encoded postgis-enabled
database. I want my shapefile to be encoded in WIN1252, and a particular
field to be in uppercase.Since I am on windows, I don't have an iconv executable. Therefore, I am
trying to :
- dump the shapefile with shp2pgsql to an sql text file
- load the resulting sql file into a postgresql table
- dump this table into a shapefile with pgsql2shp (in WIN1252)To load the shapefile into postgresql, I had to dump it without specifying
a '-W' argument, set my client_encoding to UTF8, then load the sql file into
postgresql.If I look at the data with pgAdmin (with client_encoding set to UTF8), it
looks good : accentuation is fine, special characters are fine.To dump the data in a WIN1252-encoded shapefile, pgsql2shp has no encoding
argument, so I set my client encoding to WIN1252 thtough the environment
variable PGCLIENTENCODING.If I just dump the file this way, it seems to be fine. So this command
works fine :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT mytext,
mygeom FROM mytemptable"
-> [621679 rows]But then, if I dump it through a query to have my field in uppercase, I get
an error 'character 0xc29f of encoding UTF8 has no equivalent in WIN1252'
(translated by myself, the message is in French)
The command is simply :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT
upper(mytext) as mytext, mygeom FROM mytemptable"So I guess there is a problem with my 'upper' conversion, but I have no
idea what this 0xc29f character could be.Any help would be greatly appreciated.
Thanks a lot !--
Arnaud Lesauvage
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
Raymond O'Donnell a �crit :
If it's any help to you, you can get iconv (and a bunch of other helpful
stuff) from GnuWin32:
Thanks for your help Raymond.
I tried iconv but I have other problems now.
I still have to load the file into postgresql because the shapefiles
datafile (.dbf) is associated with an index file, and I have to use
pgsql2shp to rebuild it.
I'll try some more though. Maybe iconv before shp2pgsql, then load with
client_encoding set to WIN1252, then dump.
I'll see how it goes.
Arnaud
Arnaud Lesauvage a �crit :
But then, if I dump it through a query to have my field in uppercase, I
get an error 'character 0xc29f of encoding UTF8 has no equivalent in
WIN1252' (translated by myself, the message is in French)
The command is simply :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT
upper(mytext) as mytext, mygeom FROM mytemptable"
OK, I narrowed down the problem to the WIN1252 encoding.
Using LATIN1 or LATIN9 for instance works correctly.
Since my application seems to work with LATIN9, I'll go with it.
I am still perplex though. What is this 0xc29f character ? An internet
search tells me that this is some Kanju character. I am quite confident
that if this is true, it would not convert any better to LATIN9 than to
WIN1252.
Also, doing a search like :
SELECT * FROM mytable WHERE upper(myflied) ILIKE u&'%c29f%';
Gives me 0 result.
Am I wrong to think that the error 'character 0xc29f of UTF8' relates to
the character with code point C29F in UTF8 ?
Thanks again for your help/lightings on this matter.
Arnaud
I would do this last query searching for the 0xC29F character WITHOUT the
upper() function on the source table, in the native (to table) UTF8 client
encoding. No result either?
Rob
2009/10/16 Arnaud Lesauvage <arnaud.listes@codata.eu>
Show quoted text
Arnaud Lesauvage a écrit :
But then, if I dump it through a query to have my field in uppercase, I
get an error 'character 0xc29f of encoding UTF8 has no equivalent in
WIN1252' (translated by myself, the message is in French)
The command is simply :
pgsql2shp -f myouput.shp -u postgres -g mygeom mydatabase "SELECT
upper(mytext) as mytext, mygeom FROM mytemptable"OK, I narrowed down the problem to the WIN1252 encoding.
Using LATIN1 or LATIN9 for instance works correctly.
Since my application seems to work with LATIN9, I'll go with it.I am still perplex though. What is this 0xc29f character ? An internet
search tells me that this is some Kanju character. I am quite confident that
if this is true, it would not convert any better to LATIN9 than to WIN1252.Also, doing a search like :
SELECT * FROM mytable WHERE upper(myflied) ILIKE u&'%c29f%';
Gives me 0 result.
Am I wrong to think that the error 'character 0xc29f of UTF8' relates to
the character with code point C29F in UTF8 ?Thanks again for your help/lightings on this matter.
Arnaud
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
Hey everyone,
I apologize in advance for going slightly off topic, but I have never
setup a centralized authentication scheme under Linux. My question is,
what do most people do for centralized command line, X, and PG
authentication? From what I've read the main choices are NIS or LDAP.
LDAP would be problematic as I would have to embed a login and plain
text password in the ldap.conf file for binding to the MS AD. On the
other hand, it seems like NIS is old, inflexible, outdated, and possibly
nearing end of life. We are a largely Windows shop with many app and
database servers running Linux. The Linux environment is growing too
large not to do centralized authentication of some kind.
At this point I'm open to suggestions or comments. SSH and X are
required, PG would be nice to be able to auth centrally as well while
I'm at it.
Thanks,
Scot Kreienkamp
On Fri, Oct 16, 2009 at 2:38 PM, Scot Kreienkamp <SKreien@la-z-boy.com>wrote:
Hey everyone,
I apologize in advance for going slightly off topic, but I have never setup
a centralized authentication scheme under Linux. My question is, what do
most people do for centralized command line, X, and PG authentication? From
what I’ve read the main choices are NIS or LDAP. LDAP would be problematic
as I would have to embed a login and plain text password in the ldap.conf
file for binding to the MS AD.
It sounds like PAM would be useful for you. That's really what is was
built for.
--Scott
These are all RH4 and 5, so they do all have PAM. I thought PAM had to
interface with something else, which is where NIS and LDAP enter the
picture, to authenticate to another server though. Otherwise I'm not
sure how it works?
Thanks,
Scot Kreienkamp
skreien@la-z-boy.com
From: Scott Mead [mailto:scott.lists@enterprisedb.com]
Sent: Friday, October 16, 2009 2:50 PM
To: Scot Kreienkamp
Cc: pgsql-general
Subject: Re: [GENERAL] slightly off-topic: Central Auth
On Fri, Oct 16, 2009 at 2:38 PM, Scot Kreienkamp <SKreien@la-z-boy.com>
wrote:
Hey everyone,
I apologize in advance for going slightly off topic, but I have never
setup a centralized authentication scheme under Linux. My question is,
what do most people do for centralized command line, X, and PG
authentication? From what I've read the main choices are NIS or LDAP.
LDAP would be problematic as I would have to embed a login and plain
text password in the ldap.conf file for binding to the MS AD.
It sounds like PAM would be useful for you. That's really what is
was built for.
--Scott
On 16/10/2009 19:38, Scot Kreienkamp wrote:
Hey everyone,
I apologize in advance for going slightly off topic, but I have never
setup a centralized authentication scheme under Linux. My question is,
what do most people do for centralized command line, X, and PG
authentication? From what I've read the main choices are NIS or LDAP.
LDAP would be problematic as I would have to embed a login and plain
text password in the ldap.conf file for binding to the MS AD. On the
other hand, it seems like NIS is old, inflexible, outdated, and possibly
nearing end of life. We are a largely Windows shop with many app and
database servers running Linux. The Linux environment is growing too
large not to do centralized authentication of some kind.At this point I'm open to suggestions or comments. SSH and X are
required, PG would be nice to be able to auth centrally as well while
I'm at it.
Does "PG" = PostgreSQL? If so, it can do LDAP, Kerberos and PAM, among
other things:
http://www.postgresql.org/docs/8.4/static/client-authentication.html
Ray.
--
-- Raymond O'Donnell
-- Tela Design Ltd, Craughwell, Co. Galway, Ireland.
-- Software & graphic design and consultancy
-- ray@teladesign.ie
--
On 16/10/2009 19:38, Scot Kreienkamp wrote:
Hey everyone,
I apologize in advance for going slightly off topic, but I have never
setup a centralized authentication scheme under Linux. My question
is,
what do most people do for centralized command line, X, and PG
authentication? From what I've read the main choices are NIS or LDAP.
LDAP would be problematic as I would have to embed a login and plain
text password in the ldap.conf file for binding to the MS AD. On the
other hand, it seems like NIS is old, inflexible, outdated, and
possibly
nearing end of life. We are a largely Windows shop with many app and
database servers running Linux. The Linux environment is growing too
large not to do centralized authentication of some kind.At this point I'm open to suggestions or comments. SSH and X are
required, PG would be nice to be able to auth centrally as well while
I'm at it.
Does "PG" = PostgreSQL? If so, it can do LDAP, Kerberos and PAM, among
other things:
[Scot Kreienkamp]
But of course. :)
So I guess what I see taking shape is setting up everything to auth
against PAM locally, then setting up local PAM to auth to a remote
source.
Thanks,
Scot Kreienkamp
skreien@la-z-boy.com
"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:
On 16/10/2009 19:38, Scot Kreienkamp wrote:
... We are a largely Windows shop with many app and
database servers running Linux. The Linux environment is growing too
large not to do centralized authentication of some kind.
So I guess what I see taking shape is setting up everything to auth
against PAM locally, then setting up local PAM to auth to a remote
source.
What are you using for central auth in the Windows portions of your
shop?
What I'd suggest is that you standardize on Kerberos auth (that's what
it's called in the Unix world, MS might have another name for it).
You can definitely plug Linux into an Active Directory server for this,
and I believe that you have the option to switch it around in future
if you decide you'd rather have a Linux machine as your central auth
server.
If you decide to go with this approach and use PAM as intermediary,
you'll need the patch I just committed in response to bug #5121 --- it
turns out nobody had ever tried that with Postgres before :-(. But
I think it's also possible to just use PG's native Kerberos support
with AD, which would explain why nobody had tried it.
regards, tom lane
2009/10/17 Tom Lane <tgl@sss.pgh.pa.us>:
If you decide to go with this approach and use PAM as intermediary,
you'll need the patch I just committed in response to bug #5121 --- it
turns out nobody had ever tried that with Postgres before :-(. But
I think it's also possible to just use PG's native Kerberos support
with AD, which would explain why nobody had tried it.
It most definitely is possible. Note that the recommendation is to
use GSSAPI with Kerberos and not the native "krb5" authentication for
tihs if possible.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 16/10/2009 20:38, Raymond O'Donnell wrote:
Does "PG" = PostgreSQL? If so, it can do LDAP, Kerberos and PAM, among
other things:
Apologies for the noise - I thought I was replying to a non-PG list. :-)
Ray.
------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------
InterRob a �crit :
Arnaud Lesauvage a �crit :
Also, doing a search like :
SELECT * FROM mytable WHERE upper(myflied) ILIKE u&'%c29f%';
Gives me 0 result.
Am I wrong to think that the error 'character 0xc29f of UTF8' relates
to the character with code point C29F in UTF8 ?I would do this last query searching for the 0xC29F character WITHOUT the
upper() function on the source table, in the native (to table) UTF8 client
encoding. No result either?
No, no result either.
What puzzles me is that this field contains names of French streets, and
I can't see what character could be in such a field that has no
equivalent in win1252 !
--
Arnaud Lesauvage
IT Executive
Codata Belgium SA
143-145, Avenue bourgmestre Jean Materne - 5100 Namur - Belgium
Tel : +32 81 21 53 48 - Fax : +32 81 21 54 24
Mel : arnaud.lesauvage@codata.eu
Web : http://www.codata.eu/