ignore errors for COPY

Started by Vangelis Katsikarosover 15 years ago6 messagesgeneral
Jump to latest
#1Vangelis Katsikaros
ibob17@yahoo.gr

Hello

I have postrges 8.3.12 and I have the following issue:

I have a table
create table test(
table_id integer,
datetime timestamp,
MMSI integer,
lat real,
lng real,
);

and I bulk insert data to this table with COPY.

A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR: date/time field value out of range: "2009-93-29 05:27:08"

Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?

I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.

Regards
Vangelis

#2Guillaume Lelarge
guillaume@lelarge.info
In reply to: Vangelis Katsikaros (#1)
Re: ignore errors for COPY

Le 11/11/2010 13:01, Vangelis Katsikaros a �crit :

Hello

I have postrges 8.3.12 and I have the following issue:

I have a table
create table test(
table_id integer,
datetime timestamp,
MMSI integer,
lat real,
lng real,
);

and I bulk insert data to this table with COPY.

A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR: date/time field value out of range: "2009-93-29 05:27:08"

Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?

Nope.

I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.

You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com

#3Rhys A.D. Stewart
rhys.stewart@gmail.com
In reply to: Guillaume Lelarge (#2)
Re: ignore errors for COPY

On Thu, Nov 11, 2010 at 8:05 AM, Guillaume Lelarge
<guillaume@lelarge.info> wrote:

Le 11/11/2010 13:01, Vangelis Katsikaros a écrit :

Hello

I have postrges 8.3.12 and I have the following issue:

I have a table
create table test(
     table_id integer,
     datetime timestamp,
     MMSI integer,
     lat real,
     lng real,
);

and I bulk insert data to this table with COPY.

A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR:  date/time field value out of range: "2009-93-29 05:27:08"

Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?

Nope.

I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.

You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.

--
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

olé!!!

Maybe you could import it as a text column and then deal with the
conversion in the DB....i do that sometimes....never had 100GB of data
to work with though....

Rhys

#4Vangelis Katsikaros
ibob17@yahoo.gr
In reply to: Guillaume Lelarge (#2)
Re: ignore errors for COPY [solved]

On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:

You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.

Hi

Thanks for the answer, indeed it solves the issue.

Now, I would like to ask a second question (sorry for using the same
thread :)

I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/

of if there is another way

(for example in mysql I can do the same thing by creating a symlink to
the other location inside "datadir = /var/lib/mysql" )

Regards
Vangelis

#5Guillaume Lelarge
guillaume@lelarge.info
In reply to: Vangelis Katsikaros (#4)
Re: ignore errors for COPY [solved]

Le 11/11/2010 17:46, Vangelis Katsikaros a �crit :

On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:

You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.

Hi

Thanks for the answer, indeed it solves the issue.

Great.

Now, I would like to ask a second question (sorry for using the same
thread :)

I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/

of if there is another way

To create another database in another location, you first need to create
a tablespace, and then create a database in that tablespace. For
example, in psql:

CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
CREATE DATABASE newdb TABLESPACE otherlocation;

(for example in mysql I can do the same thing by creating a symlink to
the other location inside "datadir = /var/lib/mysql" )

Don't know MySQL, so can't say :)

--
Guillaume
http://www.postgresql.fr
http://dalibo.com

#6Vangelis Katsikaros
ibob17@yahoo.gr
In reply to: Guillaume Lelarge (#5)
Re: ignore errors for COPY [solved]

On 11/11/2010 08:15 PM, Guillaume Lelarge wrote:

I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/

of if there is another way

To create another database in another location, you first need to create
a tablespace, and then create a database in that tablespace. For
example, in psql:

CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
CREATE DATABASE newdb TABLESPACE otherlocation;

Hey thanks again, that did the job :)