COPY FROM is not 8bit clean

Started by Darcy Buskermolen, almost 24 years ago, 6 messages
#1Darcy Buskermolen
darcy@ok-connect.com

ACK!!!!! Must remember which MTA I'm using...
When using COPY FROM 'file' DELIMITER '\254', COPY FROM reads past the
delimiter and ends up with parse errors when trying to do the insert.

What the ?? Why didn't that go through with the body of the text.. *sigh*
I'll resend in the AM..

#2Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Darcy Buskermolen (#1)
Re: COPY FROM is not 8bit clean

When using COPY FROM 'file' DELIMITER '\254', COPY FROM reads past the
delimiter and ends up with parse errors when trying to do the insert.

What the ?? Why didn't that go through with the body of the text.. *sigh*
I'll resend in the AM..

Good catch. It's definitely a bug in the COPY command. Please try the
following patch (this is against 7.2).

*** src/backend/commands/copy.c.orig	Tue Feb 26 21:11:05 2002
--- src/backend/commands/copy.c	Tue Feb 26 21:11:35 2002
***************
*** 1024,1030 ****
  CopyReadAttribute(FILE *fp, bool *isnull, char *delim, int *newline, char *null_print)
  {
  	int			c;
! 	int			delimc = delim[0];
  #ifdef MULTIBYTE
  	int			mblen;
--- 1024,1030 ----
  CopyReadAttribute(FILE *fp, bool *isnull, char *delim, int *newline, char *null_print)
  {
  	int			c;
! 	int			delimc = (unsigned char)delim[0];
  #ifdef MULTIBYTE
  	int			mblen;
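
For readers who want to see the effect of the one-line change in isolation, here is a minimal sketch of a delimiter-scanning loop that uses the same cast. It is not the real CopyReadAttribute: the function name read_field, its buffer handling, and its return value are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical, simplified field reader (NOT PostgreSQL's CopyReadAttribute).
 * Reads bytes from fp into buf until the delimiter, a newline, or EOF.
 * Returns the number of bytes stored; buf is always NUL-terminated
 * when bufsize > 0. Bytes that do not fit are silently dropped.
 */
static size_t
read_field(FILE *fp, const char *delim, char *buf, size_t bufsize)
{
	int		c;
	/* Same idea as the patch: widen through unsigned char so that
	 * delimc is in 0..255, the same range getc() returns. */
	int		delimc = (unsigned char) delim[0];
	size_t	n = 0;

	while ((c = getc(fp)) != EOF)
	{
		if (c == delimc || c == '\n')
			break;
		if (n + 1 < bufsize)
			buf[n++] = (char) c;
	}
	if (bufsize > 0)
		buf[n] = '\0';
	return n;
}
```

Calling read_field(fp, "\254", buf, sizeof buf) on a stream containing "ab\254cd\n" should stop at the \254 byte. Without the cast, on a platform where plain char is signed, delimc would be negative and the comparison with the value from getc could never succeed.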

#3Darcy Buskermolen
darcy@ok-connect.com
In reply to: Tatsuo Ishii (#2)
Re: COPY FROM is not 8bit clean

This patch solves the problem.

At 09:16 PM 2/26/02 +0900, Tatsuo Ishii wrote:

Good catch. It's definitely a bug in the COPY command. Please try the
following patch (this is against 7.2).

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Darcy Buskermolen (#3)
Re: COPY FROM is not 8bit clean

Can someone explain why this fixes the problem? I thought it was safe
to assign a char to an int and do a compare. The compare I see is:

	if (c == delimc)
		break;

---------------------------------------------------------------------------

Darcy Buskermolen wrote:

This patch solves the problem.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: COPY FROM is not 8bit clean

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Can someone explain why this fixes the problem?

Think about a machine where char is signed by default. Extracting \254
into an int will produce a negative value (-84, since octal 254 is 172),
which will not equal the 172 returned by getc.
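
The widening described above can be checked with a small sketch. The helper names widen_plain, widen_unsigned, and show_widening are hypothetical. The result of widen_plain depends on the platform: where plain char is signed and two's complement, the octal escape \254 (decimal 172) becomes -84, while a byte of decimal 254 would become -2. Where plain char is unsigned, both helpers agree.

```c
#include <stdio.h>

/* Widen the delimiter byte the way the unpatched code did: through
 * plain char. Sign-extends if plain char is signed on this platform. */
static int
widen_plain(const char *delim)
{
	return delim[0];
}

/* Widen the way the patch does: go through unsigned char first, so the
 * result is always in the range 0..255, like a byte returned by getc(). */
static int
widen_unsigned(const char *delim)
{
	return (unsigned char) delim[0];
}

/* Hypothetical helper: print both widenings side by side. */
static void
show_widening(const char *delim)
{
	printf("plain char: %d, via unsigned char: %d\n",
		   widen_plain(delim), widen_unsigned(delim));
}
```

Calling show_widening("\254") on a given machine shows whether the two values differ there; only the second is safe to compare against what getc returns.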

regards, tom lane

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#5)
Re: COPY FROM is not 8bit clean

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Can someone explain why this fixes the problem?

Think about a machine where char is signed by default. Extracting \254
into an int will produce a negative value (-84, since octal 254 is 172),
which will not equal the 172 returned by getc.

Oh, I thought that the int returned by getc already had that sign
extension, but now I remember it doesn't. In fact, it specifically
returns an int so -1 can be identified. Got it. Seems I am forgetting
some of my C.
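
A minimal sketch of that getc behavior, assuming a temporary file can be created: the hypothetical helper roundtrip_byte writes one byte and reads it back. Because getc returns the byte as an unsigned char converted to int, even byte 0xFF comes back as 255 and stays distinguishable from EOF.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write one byte to a temporary file, read it back
 * with getc(), and return exactly what getc() produced.
 * Returns EOF if the temporary file cannot be created or written.
 */
static int
roundtrip_byte(unsigned char byte)
{
	FILE   *fp = tmpfile();
	int		c;

	if (fp == NULL)
		return EOF;
	if (putc(byte, fp) == EOF)
	{
		fclose(fp);
		return EOF;
	}
	rewind(fp);			/* switch from writing to reading */
	c = getc(fp);		/* byte value in 0..255, or EOF */
	fclose(fp);
	return c;
}
```

Storing the result of getc in a char instead of an int would lose this distinction, which is why the loop variable c in the patch context is declared as int.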
