Bug in COPY from CSV?

Started by Rick Schumeyerabout 21 years ago3 messagesgeneral
Jump to latest
#1Rick Schumeyer
rschumeyer@ieee.org

I think I've found a bug in PG 8.0 that occurs while copying from CSV files.

I checked the bugs list but didn't see anything similar.

This occurs when reading a CSV file where one of the 'text' fields has

a blank line. I included an example that shows the problem below.

I found that if I change the blank line to have one space, pg reads

it just fine.

If this is indeed a bug, if someone could let me know the best

way to address it, I would appreciate it.

--EXAMPLE

create table t (

id integer,

description text

);

copy t from stdin with null as '' csv quote as '"';

1,"Now is the time"

2,"for all good men

to come

to the

aid of their party"

3,"The quick brown fox"

\.

--END EXAMPLE

The above example produces this output:

CREATE TABLE

psql:test2.sql:8: ERROR: unterminated CSV quoted field

CONTEXT: COPY t, line 5: ""

#2Bruce Momjian
bruce@momjian.us
In reply to: Rick Schumeyer (#1)
Re: Bug in COPY from CSV?

I did some research on this and it turns out it is one of our TODO
items. It is:

o Allow COPY FROM ... CSV to interpret newlines and carriage
returns in data

This would require major refactoring of the copy source code.

We are actually careful to warn people who dump out data with newlines
in CSV format:

test=> insert into t values (4, 'lkjasdf
test'> lkjasdf
test'>
test'>
test'>
test'> aaaa');
INSERT 542038 1
test=> copy t to '/bjm/3' with null as '' csv quote as '"';
WARNING: CSV fields with embedded linefeed or carriage return characters might not be able to be reimported
COPY

Someone has come up with a patch which might allow this so it might work
in 8.1.

I am attaching a reproducable case of your report.

---------------------------------------------------------------------------

Rick Schumeyer wrote:

I think I've found a bug in PG 8.0 that occurs while copying from CSV files.

I checked the bugs list but didn't see anything similar.

This occurs when reading a CSV file where one of the 'text' fields has

a blank line. I included an example that shows the problem below.

I found that if I change the blank line to have one space, pg reads

it just fine.

If this is indeed a bug, if someone could let me know the best

way to address it, I would appreciate it.

--EXAMPLE

create table t (

id integer,

description text

);

copy t from stdin with null as '' csv quote as '"';

1,"Now is the time"

2,"for all good men

to come

to the

aid of their party"

3,"The quick brown fox"

\.

--END EXAMPLE

The above example produces this output:

CREATE TABLE

psql:test2.sql:8: ERROR: unterminated CSV quoted field

CONTEXT: COPY t, line 5: ""

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/0text/plainDownload
#3Guy Fraser
guy@incentre.net
In reply to: Bruce Momjian (#2)
Re: Bug in COPY from CSV?

If you have command line support for PHP it is fairly easy to
write a program that would use "pg_escape_string" to condition
the data. You could even use "pg_copy_to" to send the data
to the db as well.

On Mon, 2005-14-02 at 22:30 -0500, Bruce Momjian wrote:

I did some research on this and it turns out it is one of our TODO
items. It is:

o Allow COPY FROM ... CSV to interpret newlines and carriage
returns in data

This would require major refactoring of the copy source code.

We are actually careful to warn people who dump out data with newlines
in CSV format:

test=> insert into t values (4, 'lkjasdf
test'> lkjasdf
test'>
test'>
test'>
test'> aaaa');
INSERT 542038 1
test=> copy t to '/bjm/3' with null as '' csv quote as '"';
WARNING: CSV fields with embedded linefeed or carriage return characters might not be able to be reimported
COPY

Someone has come up with a patch which might allow this so it might work
in 8.1.

I am attaching a reproducable case of your report.

---------------------------------------------------------------------------

Rick Schumeyer wrote:

I think I've found a bug in PG 8.0 that occurs while copying from CSV files.

I checked the bugs list but didn't see anything similar.

This occurs when reading a CSV file where one of the 'text' fields has

a blank line. I included an example that shows the problem below.

I found that if I change the blank line to have one space, pg reads

it just fine.

If this is indeed a bug, if someone could let me know the best

way to address it, I would appreciate it.

--EXAMPLE

create table t (

id integer,

description text

);

copy t from stdin with null as '' csv quote as '"';

1,"Now is the time"

2,"for all good men

to come

to the

aid of their party"

3,"The quick brown fox"

\.

--END EXAMPLE

The above example produces this output:

CREATE TABLE

psql:test2.sql:8: ERROR: unterminated CSV quoted field

CONTEXT: COPY t, line 5: ""

plain text document attachment (/bjm/0)
CREATE TABLE t (
id INTEGER,
description TEXT
);
COPY T FROM stdin WITH NULL AS '' CSV QUOTE AS '"';
1,"Now is the time"
2,"for all good men
to come
to the

aid of their party"
3,"The quick brown fox"
\.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

--
Guy Fraser
Network Administrator
The Internet Centre
1-888-450-6787
(780)450-6787