raw output from copy
Hi
This thread was finished without real work. I have a real use case - export
XML doc in non utf8 encoding.
/messages/by-id/16174.1319228878@sss.pgh.pa.us
I propose to implement new format option "RAW" like Tom proposed.
It requires only one row, one column result - and result is just raw binary
data without size.
Objections? Ideas?
Regards
Pavel
Hi
I wrote a prototype of this patch, and it works well
postgres=# set client_encoding to 'latin2';
SET
Time: 1.488 ms
postgres=# \copy (select xmlelement(name xx, d) from d) to ~/d.xml (format
'raw')
COPY 1
Time: 1.108 ms
postgres=# copy (select xmlelement(name xx, d) from d) to stdout (format
'raw') ;
<?xml version="1.0" encoding="LATIN2"?><xx>příliš žluťoučký kůň</xx>Time:
1.000 ms
Regards
Pavel
2015-04-09 20:48 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
Show quoted text
Hi
This thread was finished without real work. I have a real use case -
export XML doc in non utf8 encoding./messages/by-id/16174.1319228878@sss.pgh.pa.us
I propose to implement new format option "RAW" like Tom proposed.
It requires only one row, one column result - and result is just raw
binary data without size.Objections? Ideas?
Regards
Pavel
Attachments:
copy-raw.patchtext/x-patch; charset=US-ASCII; name=copy-raw.patchDownload+59-14
On 4/10/15 5:26 PM, Pavel Stehule wrote:
Hi
I wrote a prototype of this patch, and it works well
postgres=# set client_encoding to 'latin2';
SET
Time: 1.488 ms
postgres=# \copy (select xmlelement(name xx, d) from d) to ~/d.xml
(format 'raw')
COPY 1
Time: 1.108 ms
postgres=# copy (select xmlelement(name xx, d) from d) to stdout (format
'raw') ;
<?xml version="1.0" encoding="LATIN2"?><xx>příliš žluťoučký
kůň</xx>Time: 1.000 ms
I think you can get the same thing using regular psql output and just
turning off all field and record separators and tuple headers and so on.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
It would be nice, but it is not true. You can get correct non utf8 xml with
encoding specification only when binary mode is used. Psql doesn't support
binary mode.
Regards
Pavel
Dne 15. 4. 2015 22:06 napsal uživatel "Peter Eisentraut" <peter_e@gmx.net>:
Show quoted text
On 4/10/15 5:26 PM, Pavel Stehule wrote:
Hi
I wrote a prototype of this patch, and it works well
postgres=# set client_encoding to 'latin2';
SET
Time: 1.488 ms
postgres=# \copy (select xmlelement(name xx, d) from d) to ~/d.xml
(format 'raw')
COPY 1
Time: 1.108 ms
postgres=# copy (select xmlelement(name xx, d) from d) to stdout (format
'raw') ;
<?xml version="1.0" encoding="LATIN2"?><xx>příliš žluťoučký
kůň</xx>Time: 1.000 msI think you can get the same thing using regular psql output and just
turning off all field and record separators and tuple headers and so on.
Hello Pavel.
I looked through the patch. Sources are OK. However I didn't find any docs
and test cases. Would you please provide me with short description on this
feature and why it is important. Because I didn't manage to find the old
Andrew Dunstan's post either.
On Sat, Apr 11, 2015 at 12:26 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:
Hi
I wrote a prototype of this patch, and it works well
postgres=# set client_encoding to 'latin2';
SET
Time: 1.488 ms
postgres=# \copy (select xmlelement(name xx, d) from d) to ~/d.xml (format
'raw')
COPY 1
Time: 1.108 ms
postgres=# copy (select xmlelement(name xx, d) from d) to stdout (format
'raw') ;
<?xml version="1.0" encoding="LATIN2"?><xx>příliš žluťoučký kůň</xx>Time:
1.000 msRegards
Pavel
2015-04-09 20:48 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi
This thread was finished without real work. I have a real use case -
export XML doc in non utf8 encoding./messages/by-id/16174.1319228878@sss.pgh.pa.us
I propose to implement new format option "RAW" like Tom proposed.
It requires only one row, one column result - and result is just raw
binary data without size.Objections? Ideas?
Regards
Pavel
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Nullus est in vitae sensus ipsa vera est sensus.
On 1 July 2015 at 07:42, Pavel Golub <pavel@microolap.com> wrote:
I looked through the patch. Sources are OK. However I didn't find any
docs and test cases. Would you please provide me with short description on
this feature and why it is important. Because I didn't manage to find the
old Andrew Dunstan's post either.
Feature sounds OK, so lets do it.
Pavel S, please submit a polished patch. Coding guidelines, tests, docs
etc. Set back to Waiting On Author.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi
I'll do it today evening
Pavel
2015-07-02 12:55 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
Show quoted text
On 1 July 2015 at 07:42, Pavel Golub <pavel@microolap.com> wrote:
I looked through the patch. Sources are OK. However I didn't find any
docs and test cases. Would you please provide me with short description on
this feature and why it is important. Because I didn't manage to find the
old Andrew Dunstan's post either.Feature sounds OK, so lets do it.
Pavel S, please submit a polished patch. Coding guidelines, tests, docs
etc. Set back to Waiting On Author.--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/02/2015 07:14 AM, Pavel Stehule wrote:
Hi
I'll do it today evening
Pavel,
Please don't top-post on the PostgreSQL lists. You've been around here
long enough to know that bottom posting is our custom.
I posted a patch for this in 2013 at
</messages/by-id/50F2FA92.9040000@dunslane.net> but
it can apply to a SELECT, and doesn't need COPY. Nobody seemed very
interested, so I dropped it. Apparently people now want something along
these lines, which is good.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 07/02/2015 09:02 AM, Andrew Dunstan wrote:
On 07/02/2015 07:14 AM, Pavel Stehule wrote:
Hi
I'll do it today evening
Pavel,
Please don't top-post on the PostgreSQL lists. You've been around here
long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at
</messages/by-id/50F2FA92.9040000@dunslane.net>
but it can apply to a SELECT, and doesn't need COPY. Nobody seemed
very interested, so I dropped it. Apparently people now want something
along these lines, which is good.
For reference, here's the Wayback Machine's version of the original blog
post referred to:
<http://web.archive.org/web/20110916023912/http://people.planetpostgresql.org/andrew/index.php?/archives/196-Clever-trick-challenge.html>
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2 July 2015 at 14:02, Andrew Dunstan <andrew@dunslane.net> wrote:
Please don't top-post on the PostgreSQL lists. You've been around here
long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at <
/messages/by-id/50F2FA92.9040000@dunslane.net> but
it can apply to a SELECT, and doesn't need COPY. Nobody seemed very
interested, so I dropped it. Apparently people now want something along
these lines, which is good.
It's a shame that both solutions are restricted to either COPY or psql.
Both of those are working on suggestions from Tom, so there is no history
of preference there.
Can we have both please, gentlemen?
If we implemented Andrew's solution, how would we request it in a COPY
statement? Seems like we would want the RAW format keyword anyway.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2015-07-02 15:43 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
On 2 July 2015 at 14:02, Andrew Dunstan <andrew@dunslane.net> wrote:
Please don't top-post on the PostgreSQL lists. You've been around here
long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at <
/messages/by-id/50F2FA92.9040000@dunslane.net> but
it can apply to a SELECT, and doesn't need COPY. Nobody seemed very
interested, so I dropped it. Apparently people now want something along
these lines, which is good.It's a shame that both solutions are restricted to either COPY or psql.
Both of those are working on suggestions from Tom, so there is no history
of preference there.Can we have both please, gentlemen?
If we implemented Andrew's solution, how would we request it in a COPY
statement? Seems like we would want the RAW format keyword anyway.
I prefer a COPY like solution - it can be used on both sides (server,
client), and it can be used little bit simply for psql -c "XXX" pattern.
Regards
Pavel
Show quoted text
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/02/2015 09:43 AM, Simon Riggs wrote:
On 2 July 2015 at 14:02, Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net>> wrote:Please don't top-post on the PostgreSQL lists. You've been around
here long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at
</messages/by-id/50F2FA92.9040000@dunslane.net>
but it can apply to a SELECT, and doesn't need COPY. Nobody seemed
very interested, so I dropped it. Apparently people now want
something along these lines, which is good.It's a shame that both solutions are restricted to either COPY or psql.
Both of those are working on suggestions from Tom, so there is no
history of preference there.Can we have both please, gentlemen?
If we implemented Andrew's solution, how would we request it in a COPY
statement? Seems like we would want the RAW format keyword anyway.
What's the use case? My original motivation was that I had a function
that returned a bytea (it was a PDF in fact) that I wanted to be able to
write to a file. Of course, this is easy enough to do with a client
library like perl's DBD::Pg, but it seems sad to have to resort to that
for something so simple.
My original suggestion
(</messages/by-id/4EA1B83B.2050605@pgexperts.com>)
was to invent a \bcopy command.
I don't have a problem in building in a RAW mode for copy, but we'll
still need to teach psql how to deal with it.
Another case where it could be useful is JSON - so we can avoid having
to play tricks like
<http://adpgtech.blogspot.com/2014/09/importing-json-data.html>. Similar
considerations probably apply to XML, and the tricks are less guaranteed
to work.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2015-07-02 16:02 GMT+02:00 Andrew Dunstan <andrew@dunslane.net>:
On 07/02/2015 09:43 AM, Simon Riggs wrote:
On 2 July 2015 at 14:02, Andrew Dunstan <andrew@dunslane.net <mailto:
andrew@dunslane.net>> wrote:Please don't top-post on the PostgreSQL lists. You've been around
here long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at
</messages/by-id/50F2FA92.9040000@dunslane.net>
but it can apply to a SELECT, and doesn't need COPY. Nobody seemed
very interested, so I dropped it. Apparently people now want
something along these lines, which is good.It's a shame that both solutions are restricted to either COPY or psql.
Both of those are working on suggestions from Tom, so there is no history
of preference there.Can we have both please, gentlemen?
If we implemented Andrew's solution, how would we request it in a COPY
statement? Seems like we would want the RAW format keyword anyway.What's the use case? My original motivation was that I had a function that
returned a bytea (it was a PDF in fact) that I wanted to be able to write
to a file. Of course, this is easy enough to do with a client library like
perl's DBD::Pg, but it seems sad to have to resort to that for something so
simple.My original suggestion (<
/messages/by-id/4EA1B83B.2050605@pgexperts.com>) was
to invent a \bcopy command.I don't have a problem in building in a RAW mode for copy, but we'll still
need to teach psql how to deal with it.
It can be used from psql without any problems.
Show quoted text
Another case where it could be useful is JSON - so we can avoid having to
play tricks like <
http://adpgtech.blogspot.com/2014/09/importing-json-data.html>. Similar
considerations probably apply to XML, and the tricks are less guaranteed to
work.cheers
andrew
On 2 July 2015 at 15:07, Pavel Stehule <pavel.stehule@gmail.com> wrote:
It can be used from psql without any problems.
It can, but your patch does not yet do that, while Andrew's does.
We want a solution that works from psql and other clients. Hopefully the
same-ish solution.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/02/2015 10:07 AM, Pavel Stehule wrote:
2015-07-02 16:02 GMT+02:00 Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net>>:On 07/02/2015 09:43 AM, Simon Riggs wrote:
On 2 July 2015 at 14:02, Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net> <mailto:andrew@dunslane.net
<mailto:andrew@dunslane.net>>> wrote:Please don't top-post on the PostgreSQL lists. You've been
around
here long enough to know that bottom posting is our custom.I posted a patch for this in 2013 at
</messages/by-id/50F2FA92.9040000@dunslane.net>
but it can apply to a SELECT, and doesn't need COPY.
Nobody seemed
very interested, so I dropped it. Apparently people now want
something along these lines, which is good.It's a shame that both solutions are restricted to either COPY
or psql.Both of those are working on suggestions from Tom, so there is
no history of preference there.Can we have both please, gentlemen?
If we implemented Andrew's solution, how would we request it
in a COPY statement? Seems like we would want the RAW format
keyword anyway.What's the use case? My original motivation was that I had a
function that returned a bytea (it was a PDF in fact) that I
wanted to be able to write to a file. Of course, this is easy
enough to do with a client library like perl's DBD::Pg, but it
seems sad to have to resort to that for something so simple.My original suggestion
(</messages/by-id/4EA1B83B.2050605@pgexperts.com>)
was to invent a \bcopy command.I don't have a problem in building in a RAW mode for copy, but
we'll still need to teach psql how to deal with it.It can be used from psql without any problems.
In fact your patch will not work with psql's \copy nor to stdout at all,
unless I'm misreading it:
- if (cstate->binary)
+ if (cstate->binary || cstate->raw)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("COPY BINARY is not supported to stdout or from
stdin")));
So it looks like you're only supporting this where the server is writing
to a file. That's horribly narrow, and certainly doesn't meet my
original need.
Does the COPY line protocol even support binary data? If not, we're dead
in the water here from the psql POV. Because my patch doesn't use the
COPY protocol it doesn't have this problem.
Perhaps we should do both, although I'm not sure I understand the use
case for the COPY solution.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
Does the COPY line protocol even support binary data?
The protocol, per se, just transmits a byte stream. There is a field
in the CopyInResponse/CopyOutResponse messages that indicates whether
a text or binary copy is being done. One thing we'd have to consider
is whether "raw" mode is sufficiently different from binary to justify
an additional value for this field, and if so whether that constitutes
a protocol break.
IIRC, psql wouldn't really care; it just transfers the byte stream to or
from the target file, regardless of text or binary mode. But there might
be other client libraries that are smarter and expect "binary" mode to
mean the binary file format specified in the COPY reference page. So
there may be value in being explicit about "raw" mode in these messages.
A key point in all this is that people who need "raw" transfer probably
need it in both directions, a point that your SELECT proposal cannot
satisfy, but hacking COPY could. So I lean towards the latter really.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 07/02/2015 11:02 AM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Does the COPY line protocol even support binary data?
The protocol, per se, just transmits a byte stream. There is a field
in the CopyInResponse/CopyOutResponse messages that indicates whether
a text or binary copy is being done. One thing we'd have to consider
is whether "raw" mode is sufficiently different from binary to justify
an additional value for this field, and if so whether that constitutes
a protocol break.IIRC, psql wouldn't really care; it just transfers the byte stream to or
from the target file, regardless of text or binary mode. But there might
be other client libraries that are smarter and expect "binary" mode to
mean the binary file format specified in the COPY reference page. So
there may be value in being explicit about "raw" mode in these messages.A key point in all this is that people who need "raw" transfer probably
need it in both directions, a point that your SELECT proposal cannot
satisfy, but hacking COPY could. So I lean towards the latter really.
OK, let's do that. I await the result with interest.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2015-07-02 17:02 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Andrew Dunstan <andrew@dunslane.net> writes:
Does the COPY line protocol even support binary data?
The protocol, per se, just transmits a byte stream. There is a field
in the CopyInResponse/CopyOutResponse messages that indicates whether
a text or binary copy is being done. One thing we'd have to consider
is whether "raw" mode is sufficiently different from binary to justify
an additional value for this field, and if so whether that constitutes
a protocol break. sql/plpgsql_check_passive-9.6.sql
IIRC, psql wouldn't really care; it just transfers the byte stream to or
from the target file, regardless of text or binary mode. But there might
be other client libraries that are smarter and expect "binary" mode to
mean the binary file format specified in the COPY reference page. So
there may be value in being explicit about "raw" mode in these messages.
The safe way is create new mode and propagate it on client. It should to
not break any current applications, because no one uses COPY RAW.
A key point in all this is that people who need "raw" transfer probably
need it in both directions, a point that your SELECT proposal cannot
satisfy, but hacking COPY could. So I lean towards the latter really.
yes, it has sense. I am not sure, if I'll have time to implement it in this
step, but I'll look on it.
regards
Pavel
Show quoted text
regards, tom lane
Hi
here is a version with both direction support.
postgres=# copy foo from '/tmp/1.jpg' (format raw);
COPY 1
Time: 93.021 ms
postgres=# \dt+ foo
List of relations
┌────────┬──────┬───────┬───────┬────────┬─────────────┐
│ Schema │ Name │ Type │ Owner │ Size │ Description │
╞════════╪══════╪═══════╪═══════╪════════╪═════════════╡
│ public │ foo │ table │ pavel │ 256 kB │ │
└────────┴──────┴───────┴───────┴────────┴─────────────┘
(1 row)
postgres=# \copy foo to '~/3.jpg' (format raw)
COPY 1
Time: 2.401 ms
Regards
Pavel
2015-07-02 17:02 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Show quoted text
Andrew Dunstan <andrew@dunslane.net> writes:
Does the COPY line protocol even support binary data?
The protocol, per se, just transmits a byte stream. There is a field
in the CopyInResponse/CopyOutResponse messages that indicates whether
a text or binary copy is being done. One thing we'd have to consider
is whether "raw" mode is sufficiently different from binary to justify
an additional value for this field, and if so whether that constitutes
a protocol break.IIRC, psql wouldn't really care; it just transfers the byte stream to or
from the target file, regardless of text or binary mode. But there might
be other client libraries that are smarter and expect "binary" mode to
mean the binary file format specified in the COPY reference page. So
there may be value in being explicit about "raw" mode in these messages.A key point in all this is that people who need "raw" transfer probably
need it in both directions, a point that your SELECT proposal cannot
satisfy, but hacking COPY could. So I lean towards the latter really.regards, tom lane
Attachments:
copy-raw-format-20150706-01.patchtext/x-patch; charset=US-ASCII; name=copy-raw-format-20150706-01.patchDownload+362-116
Hi
previous patch was broken, and buggy
Here is new version with fixed upload and more tests
The interesting is so I should not to modify interface or client - so it
should to work with any current driver with protocol support >= 3.
Regards
Pavel
2015-07-06 23:34 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
Show quoted text
Hi
here is a version with both direction support.
postgres=# copy foo from '/tmp/1.jpg' (format raw);
COPY 1
Time: 93.021 ms
postgres=# \dt+ foo
List of relations
┌────────┬──────┬───────┬───────┬────────┬─────────────┐
│ Schema │ Name │ Type │ Owner │ Size │ Description │
╞════════╪══════╪═══════╪═══════╪════════╪═════════════╡
│ public │ foo │ table │ pavel │ 256 kB │ │
└────────┴──────┴───────┴───────┴────────┴─────────────┘
(1 row)postgres=# \copy foo to '~/3.jpg' (format raw)
COPY 1
Time: 2.401 msRegards
Pavel
2015-07-02 17:02 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Andrew Dunstan <andrew@dunslane.net> writes:
Does the COPY line protocol even support binary data?
The protocol, per se, just transmits a byte stream. There is a field
in the CopyInResponse/CopyOutResponse messages that indicates whether
a text or binary copy is being done. One thing we'd have to consider
is whether "raw" mode is sufficiently different from binary to justify
an additional value for this field, and if so whether that constitutes
a protocol break.IIRC, psql wouldn't really care; it just transfers the byte stream to or
from the target file, regardless of text or binary mode. But there might
be other client libraries that are smarter and expect "binary" mode to
mean the binary file format specified in the COPY reference page. So
there may be value in being explicit about "raw" mode in these messages.A key point in all this is that people who need "raw" transfer probably
need it in both directions, a point that your SELECT proposal cannot
satisfy, but hacking COPY could. So I lean towards the latter really.regards, tom lane