Import German Decimal Numbers
Hi!
I want to import some data from an ascii file using the COPY sql-command.
Unfortunatly the decimal numbers are given in german format, meaning the
decimal point is replaced by a comma (, instead of .). Is there any
possiblility
to switch the clients behaviourr (like 'set datestyle') for
interpretation of decimals?
Thanks!
best regards,
Christian Dittmer
am Wed, dem 07.05.2008, um 14:25:13 +0200 mailte Jan Christian Dittmer folgendes:
Hi!
I want to import some data from an ascii file using the COPY sql-command.
Unfortunatly the decimal numbers are given in german format, meaning the
decimal point is replaced by a comma (, instead of .). Is there any
possiblility
to switch the clients behaviourr (like 'set datestyle') for
interpretation of decimals?
Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).
Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
I would replace the ',' with something else such as a '#' first then
replace the decimal with the ',' then replace the '#' with a decimal '.'
If you do the ',' with a '.' first then all of them will be '.' and you
wont know which ones to change.
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of A. Kretschmer
Sent: Wednesday, May 07, 2008 9:38 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Import German Decimal Numbers
am Wed, dem 07.05.2008, um 14:25:13 +0200 mailte Jan Christian Dittmer
folgendes:
Hi!
I want to import some data from an ascii file using the COPY
sql-command.
Unfortunatly the decimal numbers are given in german format, meaning
the
decimal point is replaced by a comma (, instead of .). Is there any
possiblility
to switch the clients behaviourr (like 'set datestyle') for
interpretation of decimals?
Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).
Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
This message has been scanned by MailScanner
Thank you very much!
You have remind me that the our server runs under Linux and not under
Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also
german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it
I guess.
Christian
Ken Allen wrote:
Show quoted text
I would replace the ',' with something else such as a '#' first then
replace the decimal with the ',' then replace the '#' with a decimal '.'If you do the ',' with a '.' first then all of them will be '.' and you
wont know which ones to change.
Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).Andreas
Well if your doing an update, do it column by column and when you do a
date column replace the '.' with '/'
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Jan Christian
Dittmer
Sent: Wednesday, May 07, 2008 10:16 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Import German Decimal Numbers
Thank you very much!
You have remind me that the our server runs under Linux and not under
Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also
german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it
I guess.
Christian
Ken Allen wrote:
I would replace the ',' with something else such as a '#' first then
replace the decimal with the ',' then replace the '#' with a decimal
'.'
If you do the ',' with a '.' first then all of them will be '.' and
you
wont know which ones to change.
Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).Andreas
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
This message has been scanned by MailScanner
Jan Christian Dittmer wrote:
Thank you very much!
You have remind me that the our server runs under Linux and not under
Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also
german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it
I guess.
If all fails you could copy into a another table where you use
text columns and then translate afterwards via to_date() and
to_number() / replace().
Cheers
Tino
On Mittwoch, 7. Mai 2008, Jan Christian Dittmer wrote:
| The (current) file is 1.4 GB so it will take ages to let awk chew on it
| I guess.
If you think awk is a bottleneck, I'd recommend using perl instead. It's waaay
faster and should process your file within minutes if not faster.
Ciao,
Thomas
--
Thomas Pundt <thomas.pundt@rp-online.de> ---- http://rp-online.de/ ----
Thomas Pundt wrote:
On Mittwoch, 7. Mai 2008, Jan Christian Dittmer wrote:
| The (current) file is 1.4 GB so it will take ages to let awk chew on it
| I guess.If you think awk is a bottleneck, I'd recommend using perl instead. It's waaay
faster and should process your file within minutes if not faster.
isnt perl just another frontend for awk? ;)) SCNR ;)
There is also http://pgfoundry.org/projects/pgloader/
and if not already implemented it should be fairly
easy to implement a data filter within this one.
Cheers
Tino
Jan Christian Dittmer wrote:
Thank you very much!
You have remind me that the our server runs under Linux and not under Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it I guess.If all fails you could copy into a another table where you use
text columns and then translate afterwards via to_date() and
to_number() / replace().
For my tuppence worth, this is the approach I take. Copy into a
sandbox table first, as the only reliable thing about external input
is it won't be as it's supposed to be.
This way you can copy everything into varchar columns, then run any
number of regexes over the data to ensure it's perfect, then do:
insert into livetable select * from temptable (with the appropriate
casts in place).
Bit of a tangent, but Is there any possibility of SQL injection via
data provided to copy?
Will T
-
Import Notes
Resolved by subject fallback
William Temperley wrote:
...
Bit of a tangent, but Is there any possibility of SQL injection via
data provided to copy?
depends on how you call COPY, but usually not :-)
Cheers
Tino
Hi,
Le 7 mai 08 à 15:57, Tino Wildenhain a écrit :
There is also http://pgfoundry.org/projects/pgloader/
and if not already implemented it should be fairly
easy to implement a data filter within this one.
pgloader indeed support user reformating modules, and comes with a
mysql to pgsql timestamp reformater. Adding a python .py module
containing one function to handle the change should be easy, the
documentation has needed details if you look for "reformat_path" and
"reformat" options.
http://pgloader.projects.postgresql.org/
Plus, pgloader supports setting the DateStyle before running copy,
maybe this will be enough in your case? (didn't read all the thread).
If you need more help than current documentation to setup your
reformating module, please just ask!
Regards,
--
dim