ignore errors for COPY
Hello
I have postrges 8.3.12 and I have the following issue:
I have a table
create table test(
table_id integer,
datetime timestamp,
MMSI integer,
lat real,
lng real,
);
and I bulk insert data to this table with COPY.
A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR: date/time field value out of range: "2009-93-29 05:27:08"
Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?
I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.
Regards
Vangelis
Le 11/11/2010 13:01, Vangelis Katsikaros a �crit :
Hello
I have postrges 8.3.12 and I have the following issue:
I have a table
create table test(
table_id integer,
datetime timestamp,
MMSI integer,
lat real,
lng real,
);and I bulk insert data to this table with COPY.
A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR: date/time field value out of range: "2009-93-29 05:27:08"Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?
Nope.
I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.
You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.
--
Guillaume
http://www.postgresql.fr
http://dalibo.com
On Thu, Nov 11, 2010 at 8:05 AM, Guillaume Lelarge
<guillaume@lelarge.info> wrote:
Le 11/11/2010 13:01, Vangelis Katsikaros a écrit :
Hello
I have postrges 8.3.12 and I have the following issue:
I have a table
create table test(
table_id integer,
datetime timestamp,
MMSI integer,
lat real,
lng real,
);and I bulk insert data to this table with COPY.
A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR: date/time field value out of range: "2009-93-29 05:27:08"Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?Nope.
I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.--
Guillaume
http://www.postgresql.fr
http://dalibo.com--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
olé!!!
Maybe you could import it as a text column and then deal with the
conversion in the DB....i do that sometimes....never had 100GB of data
to work with though....
Rhys
On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:
You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.
Hi
Thanks for the answer, indeed it solves the issue.
Now, I would like to ask a second question (sorry for using the same
thread :)
I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/
of if there is another way
(for example in mysql I can do the same thing by creating a symlink to
the other location inside "datadir = /var/lib/mysql" )
Regards
Vangelis
Le 11/11/2010 17:46, Vangelis Katsikaros a �crit :
On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:
You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.Hi
Thanks for the answer, indeed it solves the issue.
Great.
Now, I would like to ask a second question (sorry for using the same
thread :)I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/of if there is another way
To create another database in another location, you first need to create
a tablespace, and then create a database in that tablespace. For
example, in psql:
CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
CREATE DATABASE newdb TABLESPACE otherlocation;
(for example in mysql I can do the same thing by creating a symlink to
the other location inside "datadir = /var/lib/mysql" )
Don't know MySQL, so can't say :)
--
Guillaume
http://www.postgresql.fr
http://dalibo.com
On 11/11/2010 08:15 PM, Guillaume Lelarge wrote:
I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/of if there is another way
To create another database in another location, you first need to create
a tablespace, and then create a database in that tablespace. For
example, in psql:CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
CREATE DATABASE newdb TABLESPACE otherlocation;
Hey thanks again, that did the job :)