plpgsql copy import csv double quotes
Hi, MailingList
PG 9.3 under Ubuntu 14.04 (I know, that’s obsolete, but we’re planning to move to pg 10 or more during 2020)
I’ve experience issues with double quotes \34 inside fields, in a csv file.
Ex :
"value1","some text","other text with "double quotes" inside","last field"
When I import this line in a table, with the exact column number, with the « copy » command in plpgslq, the import fails.
Saying that too many fields are present.
I tried to catch the double quotes and delete then, but that’s hard under AWK or sed
_________________________________
Cordialement, Pascal CROZET
DBA - [cid:image002.png@01D57EF8.10B0A960] <http://www.qualis-consulting.com> Qualis Consulting<http://www.qualis-consulting.com/>
• www.qualis-consulting.com<http://www.qualis-consulting.com/> • 04 78 22 74 90
• Le Bois des Côtes 1 – Bâtiment A
• 300 Route Nationale 6 – 69760 LIMONEST
_________________________________
GRANT and REVOKE trigger on a ddl_command_end event trigger but don't provide any information beyond whether it was a table, schema, function, etc. that was affected. No object IDs or the like are included. How would you find out which table had its ACLs modified?
Also, why do grants and revokes have an object_type of 'TABLE' instead of lower case names like 'table' for all other event types?
Thanks,
Miles Elam
Attachments:
On 10/9/19 2:20 PM, PASCAL CROZET wrote:
Hi, MailingList
PG 9.3 under Ubuntu 14.04 (I know, that�s obsolete, but we�re planning
to move to pg 10 or more during 2020)I�ve experience issues with double quotes \34 inside fields, in a csv file.
Ex�:
"value1","some text","other text with "double quotes" inside","last field"
When I import this line in a table, with the exact column number, with
the ��copy�� command in plpgslq, the import fails.
What is the actual command?
This is being done inside a plpgsql function, correct?
If inside a function what is the function code?
My guess is you need to use the CSV format to COPY:
https://www.postgresql.org/docs/11/sql-copy.html
Saying that too many fields are present.
I tried to catch the double quotes and delete then, but that�s hard
under AWK or sed*_________________________________*
Cordialement, *Pascal CROZET*
*DBA - *<http://www.qualis-consulting.com>Qualis Consulting
<http://www.qualis-consulting.com/>�www.qualis-consulting.com <http://www.qualis-consulting.com/>�04 78 22
74 90
�Le Bois des C�tes 1 � B�timent A
�300 Route Nationale 6 � 69760 LIMONEST
*_________________________________*GRANT and REVOKE trigger on a ddl_command_end event trigger but don't
provide any information beyond whether it was a table, schema, function,
etc. that was affected. No object IDs or the like are included. How
would you find out which table had its ACLs modified?Also, why do grants and revokes have an object_type of 'TABLE' instead
of lower case names like 'table' for all other event types?Thanks,
Miles Elam
--
Adrian Klaver
adrian.klaver@aklaver.com
PASCAL CROZET <pascal.crozet@qualis-consulting.com> writes:
Ive experience issues with double quotes \34 inside fields, in a csv file.
Ex :
"value1","some text","other text with "double quotes" inside","last field"
I don't know of any definition of CSV format by which that's legal data.
The typical rule is that double quotes that are data must be doubled;
at least, that's what COPY expects by default. You can also get COPY
to handle variants like backslash-quote.
regards, tom lane
Thanks,
As it's write in the page,<https://www.postgresql.org/docs/9.3/sql-copy.html> the file format is more a convention than a standard.
With collegues, we've find the solution with a regex in sed
sed -e 's/\([^,]\)"\([^,]\)/\1\2/g' -e 's/\([^,]\)"\([^,]\)/\1\2/g'
Because file contains values like
,"hostname1.fqdn.ad|\"\"\"\"\"\"\"\"\"\"0Clean|OK"
But, other csv file, that contains this value ","NAME=\"UBUNTU\"","| works well. I find this value in destination column |NAME=\UBUNTU\| in the destination table.
The main lines in the plpgsql function are :
CREATE OR REPLACE FUNCTION insert_into_db_sources_csv(
filename character varying,
tablename character varying,
delimiter character varying,
header character varying)
RETURNS void AS
....
request := 'TRUNCATE ' || tablename || '; COPY ' || tablename || ' FROM ''' || filename || ''' CSV ' || header || ' DELIMITER ''' || delimiter || ''' ENCODING ''UTF-8'';';
EXECUTE request;
The function call :
select insert_into_db_sources_csv('/DATA/input/files/Extract_software.csv', 't_m03_software', ',', 'HEADER');
If the import fails, TRUNCATE isn't executed. The previous data's that was in table remains the same.
_________________________________
Cordialement, Pascal CROZET
DBA
• www.qualis-consulting.com<http://www.qualis-consulting.com/> • 04 78 22 74 90
• Le Bois des Côtes 1 – Bâtiment A
• 300 Route Nationale 6 – 69760 LIMONEST
_________________________________
________________________________
De : Tom Lane <tgl@sss.pgh.pa.us>
Envoyé : jeudi 10 octobre 2019 00:31
À : PASCAL CROZET
Cc : PG-General Mailing List
Objet : Re: plpgsql copy import csv double quotes
PASCAL CROZET <pascal.crozet@qualis-consulting.com> writes:
I’ve experience issues with double quotes \34 inside fields, in a csv file.
Ex :
"value1","some text","other text with "double quotes" inside","last field"
I don't know of any definition of CSV format by which that's legal data.
The typical rule is that double quotes that are data must be doubled;
at least, that's what COPY expects by default. You can also get COPY
to handle variants like backslash-quote.
regards, tom lane
On 10/10/19 8:30 AM, PASCAL CROZET wrote:
Thanks,
As it's write in the page,
<https://www.postgresql.org/docs/9.3/sql-copy.html> the file format is
more a convention than a standard.With collegues, we've find the solution with a regex in sed
sed -e 's/\([^,]\)"\([^,]\)/\1\2/g' -e� 's/\([^,]\)"\([^,]\)/\1\2/g'
Because file contains values like
,"hostname1.fqdn.ad|\"\"\"\"\"\"\"\"\"\"0Clean|OK"
The issue is less the file format then the data value format. I am
struggling to figure out what the above is doing.
But, other csv file, that contains this value ","NAME=\"UBUNTU\"","|
works well. I find this value in destination column |NAME=\UBUNTU\| in
the destination table.The main lines in the plpgsql function are :
You might want to look at dollar quoting:
https://www.postgresql.org/docs/9.3/plpgsql-development-tips.html
40.11.1. Handling of Quotation Marks
If it where me I would separate out the commands below into distinct
EXECUTES, it would be easier to follow. That will still result in the
TRUNCATE being rolled back as it is part of the function transaction and:
https://www.postgresql.org/docs/11/sql-truncate.html
"TRUNCATE is transaction-safe with respect to the data in the tables:
the truncation will be safely rolled back if the surrounding transaction
does not commit."
If you want to deal with errors then:
https://www.postgresql.org/docs/9.3/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
CREATE OR REPLACE FUNCTION insert_into_db_sources_csv(
��� filename character varying,
��� tablename character varying,
��� delimiter character varying,
��� header character varying)
� RETURNS void AS
....
request := 'TRUNCATE ' || tablename || '; COPY ' || tablename || ' FROM
''' || filename || ''' CSV ' || header || ' DELIMITER ''' || delimiter
|| ''' ENCODING ''UTF-8'';';
EXECUTE request;The function call :
select
insert_into_db_sources_csv('/DATA/input/files/Extract_software.csv',
't_m03_software', ',', 'HEADER');If the import fails, TRUNCATE isn't executed. The previous data's that
was in table remains the same.*_________________________________*
Cordialement, *Pascal CROZET**
**DBA *
�www.qualis-consulting.com <http://www.qualis-consulting.com/>�04 78 22
74 90�Le Bois des C�tes 1 � B�timent A
�300 Route Nationale 6 � 69760 LIMONEST
*_________________________________*------------------------------------------------------------------------
*De :* Tom Lane <tgl@sss.pgh.pa.us>
*Envoy� :* jeudi 10 octobre 2019 00:31
*� :* PASCAL CROZET
*Cc�:* PG-General Mailing List
*Objet :* Re: plpgsql copy import csv double quotes
PASCAL CROZET <pascal.crozet@qualis-consulting.com> writes:I�ve experience issues with double quotes \34 inside fields, in a csv file.
Ex :
"value1","some text","other text with "double quotes" inside","last field"I don't know of any definition of CSV format by which that's legal data.
The typical rule is that double quotes that are data must be doubled;
at least, that's what COPY expects by default.� You can also get COPY
to handle variants like backslash-quote.����������������������� regards, tom lane
--
Adrian Klaver
adrian.klaver@aklaver.com
Hoho 🙄🤔
Don't paid attention to the rest of the conversation that begins with "GRANT and REVOKE trigger on a ddl_command_end "
I've forwarded the email without deleting the previous message 😮😕😞
Salutations, Pascal
On Thu, Oct 10, 2019 at 9:20 PM +0200, "Adrian Klaver" <adrian.klaver@aklaver.com> wrote:
On 10/10/19 8:30 AM, PASCAL CROZET wrote:
Thanks,
As it's write in the page,
the file format is
more a convention than a standard.With collegues, we've find the solution with a regex in sed
sed -e 's/\([^,]\)"\([^,]\)/\1\2/g' -e 's/\([^,]\)"\([^,]\)/\1\2/g'
Because file contains values like
,"hostname1.fqdn.ad|\"\"\"\"\"\"\"\"\"\"0Clean|OK"
The issue is less the file format then the data value format. I am
struggling to figure out what the above is doing.
But, other csv file, that contains this value ","NAME=\"UBUNTU\"","|
works well. I find this value in destination column |NAME=\UBUNTU\| in
the destination table.The main lines in the plpgsql function are :
You might want to look at dollar quoting:
https://www.postgresql.org/docs/9.3/plpgsql-development-tips.html
40.11.1. Handling of Quotation Marks
If it where me I would separate out the commands below into distinct
EXECUTES, it would be easier to follow. That will still result in the
TRUNCATE being rolled back as it is part of the function transaction and:
https://www.postgresql.org/docs/11/sql-truncate.html
"TRUNCATE is transaction-safe with respect to the data in the tables:
the truncation will be safely rolled back if the surrounding transaction
does not commit."
If you want to deal with errors then:
https://www.postgresql.org/docs/9.3/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
CREATE OR REPLACE FUNCTION insert_into_db_sources_csv(
filename character varying,
tablename character varying,
delimiter character varying,
header character varying)
RETURNS void AS
....
request := 'TRUNCATE ' || tablename || '; COPY ' || tablename || ' FROM
''' || filename || ''' CSV ' || header || ' DELIMITER ''' || delimiter
|| ''' ENCODING ''UTF-8'';';
EXECUTE request;The function call :
select
insert_into_db_sources_csv('/DATA/input/files/Extract_software.csv',
't_m03_software', ',', 'HEADER');If the import fails, TRUNCATE isn't executed. The previous data's that
was in table remains the same.*_________________________________*
Cordialement, *Pascal CROZET**
**DBA *
•www.qualis-consulting.com •04 78 22
74 90•Le Bois des Côtes 1 – Bâtiment A
•300 Route Nationale 6 – 69760 LIMONEST
*_________________________________*------------------------------------------------------------------------
*De :* Tom Lane
*Envoyé :* jeudi 10 octobre 2019 00:31
*À :* PASCAL CROZET
*Cc :* PG-General Mailing List
*Objet :* Re: plpgsql copy import csv double quotes
PASCAL CROZET writes:I’ve experience issues with double quotes \34 inside fields, in a csv file.
Ex :
"value1","some text","other text with "double quotes" inside","last field"I don't know of any definition of CSV format by which that's legal data.
The typical rule is that double quotes that are data must be doubled;
at least, that's what COPY expects by default. You can also get COPY
to handle variants like backslash-quote.regards, tom lane
--
Adrian Klaver
adrian.klaver@aklaver.com