COPY encoding

Started by Andrew Dunstan almost 18 years ago, 4 messages
#1 Andrew Dunstan
andrew@dunslane.net

In helping someone on IRC it has become apparent that unless I am
mistaken "COPY foo from 'filename'" is reading the file according to the
client encoding.

Is that the expected behaviour? The client might have no influence at
all on the contents of the file. Offhand, I would have said that a
server-resident file should be interpreted in the database encoding. Of
course, the client could change its encoding to influence how the file
is interpreted, but that seems rather kludgy. Maybe we really need an
encoding parameter to the COPY command.
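To make the concern concrete, here is a minimal sketch in Python (not PostgreSQL code) of how the same file bytes read differently depending on the encoding the reader assumes. The helper name `interpret` and the sample word are hypothetical, chosen only for illustration.

```python
def interpret(raw: bytes, assumed_encoding: str):
    """Decode raw file bytes under an assumed encoding.

    Returns the decoded text, or None if the bytes are not valid
    in that encoding. Hypothetical helper, for illustration only.
    """
    try:
        return raw.decode(assumed_encoding)
    except UnicodeDecodeError:
        return None


# A file written in LATIN1 stores "café" as four bytes.
latin1_file = "café".encode("latin-1")

# Read with the matching encoding, the text round-trips.
good = interpret(latin1_file, "latin-1")

# Read as UTF-8, the trailing byte 0xE9 is an incomplete sequence.
bad = interpret(latin1_file, "utf-8")

# The reverse mismatch does not fail; it silently produces mojibake.
utf8_file = "café".encode("utf-8")
garbled = interpret(utf8_file, "latin-1")
```

The point of the sketch is that the reader's assumed encoding, not the file itself, decides whether the data is accepted, rejected, or silently garbled, which is why it matters whether COPY uses the client encoding or the database encoding for a server-side file.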

At the very least we should document the encoding behaviour.

cheers

andrew

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: COPY encoding

Andrew Dunstan <andrew@dunslane.net> writes:

In helping someone on IRC it has become apparent that unless I am
mistaken "COPY foo from 'filename'" is reading the file according to the
client encoding.

Is that the expected behaviour?

Yes, it is. Not sure if it's adequately documented.

regards, tom lane

#3 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: COPY encoding

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

In helping someone on IRC it has become apparent that unless I am
mistaken "COPY foo from 'filename'" is reading the file according to the
client encoding.

Is that the expected behaviour?

Yes, it is. Not sure if it's adequately documented.

Will this cover the case?

diff -c -r1.80 copy.sgml
*** copy.sgml   18 Apr 2007 02:28:22 -0000      1.80
--- copy.sgml   16 Jan 2008 20:44:02 -0000
***************
*** 363,368 ****
--- 363,376 ----
      happened well into a large copy operation. You might wish to invoke
      <command>VACUUM</command> to recover the wasted space.
     </para>
+
+    <para>
+     Input data is interpreted according to the current client encoding,
    and output data is encoded in the current client encoding, even
+     if the data does not pass through the client but is read from or
+     written to a file.
+    </para>
+
   </refsect1>

<refsect1>
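Given the behaviour described in the patch text, the workaround mentioned earlier in the thread (changing the client encoding around the COPY) could be sketched roughly as follows. This is only an illustration in Python that builds the SQL statements as strings; the function name `copy_with_encoding`, the table name, and the file path are hypothetical, and the sketch does no quoting or escaping, so it is not suitable for untrusted input.

```python
def copy_with_encoding(table: str, path: str, file_encoding: str) -> list[str]:
    """Build the statements for loading a server-side file whose
    encoding differs from the session's usual client encoding.

    Hypothetical helper: it only formats strings and never talks
    to a server. No escaping is performed on the arguments.
    """
    return [
        # Tell the server how to interpret the incoming bytes.
        f"SET client_encoding TO '{file_encoding}';",
        # The file is now read according to that encoding.
        f"COPY {table} FROM '{path}';",
        # Restore the session default afterwards.
        "RESET client_encoding;",
    ]


statements = copy_with_encoding("foo", "filename", "LATIN1")
```

Running the three statements in one session, in order, is what makes this work; it is also what makes it feel like a kludge, since the encoding is a property of the file rather than of the client.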

cheers

andrew

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#3)
Re: COPY encoding

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Yes, it is. Not sure if it's adequately documented.

Will this cover the case?

Text looks OK. I think it might fit better a bit further up, adjacent
to the para about DateStyle which is a somewhat comparable
consideration.

regards, tom lane