How to ignore blank lines with file_fdw

Started by Nicklas Avén · almost 12 years ago · 3 messages · general
#1 Nicklas Avén
nicklas.aven@jordogskog.no

Hello

I am struggling to find the best way to ignore blank lines in a CSV file when using file_fdw.

A blank line makes the table unreadable.

I would like to avoid manipulating the file directly and avoid the need to make a new corrected copy of the file.

I am on Linux so I have found a solution when using COPY:
COPY test_table from program 'sed ''/^ *$/d'' /opt/builds/inotify_test/test.csv' with (format 'csv', header 'true');

but since the "program" option does not seem to be implemented in file_fdw, I am still searching for a solution.

I have also found in an email from 2011
/messages/by-id/4E699DE6.8010606@gmail.com

that when force_not_null was implemented in file_fdw the patch also included "some cosmetic changes such as removing useless blank lines."
But I do not find that blank lines are removed in general, since I cannot read CSV files with blank lines, and I do not understand how the option "force_not_null" could do the trick, since it works at the column level and not on lines/rows.

Any good ideas out there?

Thanks
Nicklas Avén

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Nicklas Avén (#1)
Re: How to ignore blank lines with file_fdw

Nicklas Avén wrote:

I have also found in an email from 2011
/messages/by-id/4E699DE6.8010606@gmail.com

that when force_not_null was implemented in file_fdw the patch also included "some cosmetic changes
such as removing useless blank lines."

That is referring to blank lines in the PostgreSQL source code,
not in the file processed :^)

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3 Ian Lawrence Barwick
barwick@gmail.com
In reply to: Nicklas Avén (#1)
Re: Fwd: How to ignore blank lines with file_fdw

On 22/04/14 21:09, Nicklas Avén wrote:

Hallo

I am struggling to find the best way to ignore blank lines in
a CSV file when using file_fdw.

A blank line makes the table unreadable.

I would like to avoid manipulating the file directly and avoid the
need to make a new corrected copy of the file.

I am on Linux so I have found a solution when using COPY:
COPY test_table from program 'sed ''/^ *$/d'' /opt/builds/inotify_test/test.csv' with (format 'csv', header 'true');

but since the "program" option does not seem to be implemented in file_fdw,
I am still searching for a solution.

file_fdw uses the same mechanism internally as "COPY <table> FROM '/file.csv'";
I don't think there's currently a way for this mechanism to ignore blank
lines.

Unfortunately CSV is not exactly a well-defined standard, so it's debatable
whether it's worth modifying the mechanism to cope with this situation.
The closest thing to a standard, RFC 4180 ( http://tools.ietf.org/html/rfc4180 ),
doesn't seem to have anything to say about blank lines; on the other hand, LibreOffice
Calc will happily import files with blank lines.
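
A note for readers on newer releases: file_fdw gained a "program" option in PostgreSQL 10, so the sed filter from the COPY example above can be attached to the foreign table itself. The following is only a minimal sketch under that assumption. The server name csv_files, the table name test_table_ft and the two columns are hypothetical placeholders; the file path and the sed expression are taken from the original post. Running a program this way is normally restricted to superusers or suitably privileged roles.

```sql
-- Assumes PostgreSQL 10 or later, where file_fdw accepts "program".
CREATE EXTENSION IF NOT EXISTS file_fdw;

-- Hypothetical server name; file_fdw servers need no options.
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical columns; adjust them to match the real CSV layout.
CREATE FOREIGN TABLE test_table_ft (
    id   integer,
    name text
)
SERVER csv_files
OPTIONS (
    -- sed drops empty or space-only lines before file_fdw parses the data
    program 'sed ''/^ *$/d'' /opt/builds/inotify_test/test.csv',
    format  'csv',
    header  'true'
);

-- The filter runs again on every scan, so the source file is never modified.
SELECT * FROM test_table_ft;
```

On releases older than 10 this option is not available, and the statement above would be rejected as an invalid option.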

I have also found in an email from 2011
/messages/by-id/4E699DE6.8010606@gmail.com

that when force_not_null was implemented in file_fdw the patch also
included "some cosmetic changes such as removing useless blank lines."
But I do not find that blank lines are removed in general, since I
cannot read CSV files with blank lines, and I do not understand how
the option "force_not_null" could do the trick, since it works at the
column level and not on lines/rows.

The "blank lines" referred to here are in the source code itself.

Regards

Ian Barwick

--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
