libpq compression
Hi hackers,
One of our customers managed to improve speed about 10 times by
using SSL compression for a system where the client and servers are
located in different geographical regions
and query results are very large because of JSON columns. They actually
do not need encryption, just compression.
I expect that it is not the only case where compression of libpq
protocol can be useful. Please notice that Postgres replication is also
using libpq protocol.
Taking into account that a vulnerability was found in SSL compression,
and SSL compression is therefore considered deprecated and insecure
(http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html),
it would be nice to have some alternative mechanism for reducing libpq
traffic.
I have implemented a prototype of it (patch is attached).
To use zstd compression, Postgres should be configured with --with-zstd.
Otherwise compression will use zlib unless it is disabled by
--without-zlib option.
I have added compression=on/off parameter to connection string and -Z
option to psql and pgbench utilities.
Below are some results:
Compression ratio (raw->compressed):

pgbench -i -s 10:
    libz (level=1):    16997209->2536330
    libzstd (level=1): 16997209->268077

pgbench -t 100000 -S:
    libz (level=1):    6289036->1523862, 6600338<-900293
    libzstd (level=1): 6288933->1777400, 6600338<-1000318

There is no mistyping: libzstd compresses COPY data about 10 times better
than libz, with a wonderful compression ratio of 63.
Influence on execution time is minimal (I have tested a local
configuration where client and server are on the same host):

pgbench -i -s 10:
    no compression:    1.552
    libz (level=1):    1.572
    libzstd (level=1): 1.611

pgbench -t 100000 -S:
    no compression:    4.482
    libz (level=1):    4.926
    libzstd (level=1): 4.877
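For a rough feel of such numbers, here is a minimal sketch measuring the zlib level-1 ratio on a synthetic JSON-like payload. The data is made up and will not reproduce the pgbench figures above (and zstd is omitted because it needs a third-party module); it only shows how the ratios were obtained.

```python
import json
import zlib

# Hypothetical JSON-ish payload standing in for a large query result
# with repetitive keys, the kind of data the thread discusses.
rows = [{"id": i, "name": "customer-%d" % i, "tags": ["new", "active"]}
        for i in range(5000)]
raw = json.dumps(rows).encode()

compressed = zlib.compress(raw, 1)  # level 1, as used in the measurements
print("raw=%d compressed=%d ratio=%.1f"
      % (len(raw), len(compressed), len(raw) / len(compressed)))
```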
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
libpq-compression-2.patch (text/x-patch, +665 -78)
On 30 March 2018 at 14:53, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
I have implemented some prototype implementation of it (patch is
attached).
I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
1st April, the patch can still be applied on top of the master branch, and it
looks like it even works.
I assume the main concern here is that it's implemented in a rather
non-extensible way. Also, if I understand correctly, it compresses the data
stream in both directions, server <-> client; I'm not sure there is any value
in compressing what a client sends to a server. But I'm still wondering why it
didn't at least start a discussion about how it could be implemented. Do I
miss something?
On 15.05.2018 13:23, Dmitry Dolgov wrote:
I assume the main concern here is that it's implemented in a rather not
extensible way. Also, if I understand correctly, it compresses the data
stream in both directions server <-> client, not sure if there is any
value in compressing what a client sends to a server. But still I'm
wondering why it didn't start at least a discussion about how it can be
implemented. Do I miss something?
Implementation of libpq compression will be included in the next release of
PgProEE.
It looks like the community is not so interested in this patch. Frankly
speaking, I do not understand why.
Compression of libpq traffic can significantly increase the speed of:
1. COPY
2. Replication (both streaming and logical)
3. Queries returning large result sets (for example JSON) through slow
connections.
It is possible to compress libpq traffic using SSL compression, but SSL
compression is an unsafe and deprecated feature.
Yes, this patch is not extensible: it can use either zlib or zstd.
Unfortunately the internal Postgres compressor pglz doesn't provide a
streaming API.
Maybe it is a good idea to combine it with Ildus's patch (custom
compression methods): https://commitfest.postgresql.org/18/1294/
In this case it would be possible to use any custom compression
algorithm. But we would need to design and implement a streaming API for
pglz and other compressors.
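To illustrate what a streaming API means here (and what pglz lacks), a minimal sketch using zlib's stateful compressor: compression history is shared across messages, and each message is flushed so the peer can decode it immediately. The class and method names are invented for illustration, not taken from the patch.

```python
import zlib

class StreamCompressor:
    """Sketch of a streaming compression API: state (history) persists
    across messages, yet every message is individually decodable."""

    def __init__(self, level=1):
        self._comp = zlib.compressobj(level)
        self._decomp = zlib.decompressobj()

    def compress_message(self, msg: bytes) -> bytes:
        # Z_SYNC_FLUSH emits all pending output without terminating the
        # stream, which is what per-message protocol framing needs.
        return self._comp.compress(msg) + self._comp.flush(zlib.Z_SYNC_FLUSH)

    def decompress_message(self, data: bytes) -> bytes:
        return self._decomp.decompress(data)

s = StreamCompressor()
m1 = s.compress_message(b"DataRow: hello world\n" * 20)
m2 = s.compress_message(b"DataRow: hello world\n" * 20)
print(len(m1), len(m2))  # the second message benefits from shared history
```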
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 05/15/2018 08:53 AM, Konstantin Knizhnik wrote:
Looks like community is not so interested in this patch. Frankly
speaking I do not understand why.
I'm sure there is plenty of interest in this. However, you guys need to
understand where we are in the development cycle. We're trying to wrap
up Postgres 11, which was feature frozen before this patch ever landed.
So it's material for Postgres 12. That means it will probably need to
wait a little while before it gets attention. It doesn't mean nobody is
interested.
I disagree with Dmitry about compressing in both directions - I can
think of plenty of good cases where we would want to compress traffic
from the client.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 15 May 2018 at 21:36, Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
wrote:
Looks like community is not so interested in this patch. Frankly
speaking I do not understand why.
I'm definitely very interested, and simply missed the post.
I'll talk with some team mates as we're doing some PG12 planning now.
Yes, this patch is not extensible: it can use either zlib either zstd.
Unfortunately internal Postgres compression pglz doesn't provide streaming
API.
May be it is good idea to combine it with Ildus patch (custom compression
methods): https://commitfest.postgresql.org/18/1294/
Given the history of issues with attempting custom/pluggable compression
for toast etc, I really wouldn't want to couple those up.
pglz wouldn't make much sense for protocol compression anyway, except maybe
for fast local links where it was worth a slight compression overhead but
not the cpu needed for gzip. I don't think it's too exciting. zlib/gzip is
likely the sweet spot for the reasonable future for protocol compression,
or a heck of a lot better than what we have, anyway.
We should make sure the protocol part is extensible, but the implementation
doesn't need to be pluggable.
I disagree with Dmitry about compressing in both directions - I can think
of plenty of good cases where we would want to compress traffic from the
client.
Agreed. The most obvious case being COPY, but there's also big bytea
values, etc.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2018-05-15 9:53 GMT-03:00 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>:
Looks like community is not so interested in this patch. Frankly speaking I
do not understand why.
AFAICS the lack of replies is due to feature freeze. I'm pretty sure
people are interested in this topic (at least I am). Did you review a
previous discussion [1] about this?
I did a prototype a few years ago. I didn't look at your patch yet.
I'll do it in a few weeks. Please add your patch to the next CF [2].
[1]: /messages/by-id/4FD9698F.2090407@timbira.com
[2]: https://commitfest.postgresql.org/18/
--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Hello!
I have noticed that psql --help lacks the -Z|--compression option.
Also it would be nice to have an option like --compression-level in psql
and pgbench.
On 03/30/2018 03:53 PM, Konstantin Knizhnik wrote:
I have added compression=on/off parameter to connection string and -Z
option to psql and pgbench utilities.
--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 16.05.2018 18:09, Grigory Smolkin wrote:
Hello!
I have noticed that psql --help lack -Z|--compression option.
Also it would be nice to have option like --compression-level in psql
and pgbench.
Thank you for this notice.
Updated and rebased patch is attached.
Concerning specification of the compression level: I have made many
experiments with different data sets and both zlib/zstd, and in both
cases using a compression level higher than the default doesn't cause any
noticeable increase of the compression ratio, but quite significantly
reduces speed. Moreover, for "pgbench -i" zstd provides a better
compression ratio (63 times!) with compression level 1 than with the
largest recommended compression level 22! This is why I decided not to
allow the user to choose the compression level.
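The shape of that experiment can be sketched with zlib on synthetic, highly repetitive rows resembling "pgbench -i" COPY data. The data is hypothetical and the real ratios will differ; the point is only that higher levels buy little ratio on such data.

```python
import zlib

# Hypothetical stand-in for repetitive pgbench account rows.
data = b"".join(b"%08d|1|0|                    |\n" % i for i in range(20000))

for level in (1, 6, 9):
    size = len(zlib.compress(data, level))
    print("level %d: %d -> %d (ratio %.1f)"
          % (level, len(data), size, len(data) / size))
```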
Attachments:
libpq-compression-3.patch (text/x-patch, +666 -78)
On Thu, May 17, 2018 at 3:54 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
Thank you for this notice.
Updated and rebased patch is attached.
Hi Konstantin,
Seems very useful. +1.
+ rc = inflate(&zs->rx, Z_SYNC_FLUSH);
+ if (rc != Z_OK)
+ {
+ return ZPQ_DECOMPRESS_ERROR;
+ }
Does this actually guarantee that zs->rx.msg is set to a string? I
looked at some documentation here:
https://www.zlib.net/manual.html
It looks like return value Z_DATA_ERROR means that msg is set, but for
the other error codes Z_STREAM_ERROR, Z_BUF_ERROR, Z_MEM_ERROR it
doesn't explicitly say that. From a casual glance at
https://github.com/madler/zlib/blob/master/inflate.c I think it might
be set to Z_NULL and then never set to a string except in the mode =
BAD paths that produce the Z_DATA_ERROR return code. That's
interesting because later we do this:
+ if (r == ZPQ_DECOMPRESS_ERROR)
+ {
+ ereport(COMMERROR,
+ (errcode_for_socket_access(),
+ errmsg("Failed to decompress data: %s", zpq_error(PqStream))));
+ return EOF;
... where zpq_error() returns zs->rx.msg. That might crash or show
"(null)" depending on libc.
Also, message style: s/F/f/
+ssize_t zpq_read(ZpqStream* zs, void* buf, size_t size, size_t* processed)
Code style: We write "Type *foo", not "Type* var". We put the return
type of a function definition on its own line.
It looks like there is at least one place where zpq_stream.o's symbols
are needed but it isn't being linked in, so the build fails in some
ecpg stuff reached by make check-world:
gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -O2 -pthread -D_REENTRANT
-D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS test1.o
-L../../../../../src/port -L../../../../../src/common -L../../ecpglib
-lecpg -L../../pgtypeslib -lpgtypes
-L../../../../../src/interfaces/libpq -lpq -Wl,--as-needed
-Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags -lpgcommon
-lpgport -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm -o test1
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_free'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_error'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_read'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to
`zpq_buffered'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to
`zpq_create'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_write'
--
Thomas Munro
http://www.enterprisedb.com
On Thu, May 17, 2018 at 3:54 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
Concerning specification of compression level: I have made many experiments
with different data sets and both zlib/zstd and in both cases using
compression level higher than default doesn't cause some noticeable increase
of compression ratio, but quite significantly reduce speed. Moreover, for
"pgbench -i" zstd provides better compression ratio (63 times!) with
compression level 1 than with with largest recommended compression level 22!
This is why I decided not to allow user to choose compression level.
Speaking of configuration, are you planning to support multiple
compression libraries at the same time? It looks like the current
patch implicitly requires client and server to use the same configure
option, without any attempt to detect or negotiate. Do I guess
correctly that a library mismatch would produce an incomprehensible
corrupt stream message?
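That guess is easy to illustrate: handing a stream produced by one algorithm to a decompressor for another fails with a generic stream error, which is all a mismatched peer could report. A contrived sketch (the zstd magic bytes are real; the scenario is simulated with zlib only):

```python
import zlib

# A zstd-compressed stream begins with magic bytes 0x28 0xB5 0x2F 0xFD.
# A zlib decompressor rejects them at the header check, so a library
# mismatch surfaces only as an incomprehensible "corrupt stream" error.
fake_zstd_stream = b"\x28\xb5\x2f\xfd" + b"payload bytes..."

try:
    zlib.decompressobj().decompress(fake_zstd_stream)
    print("unexpectedly decoded")
except zlib.error as e:
    print("corrupt stream:", e)
```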
--
Thomas Munro
http://www.enterprisedb.com
On Tue, Jun 05, 2018 at 06:04:21PM +1200, Thomas Munro wrote:
On Thu, May 17, 2018 at 3:54 AM, Konstantin Knizhnik
Speaking of configuration, are you planning to support multiple
compression libraries at the same time? It looks like the current
patch implicitly requires client and server to use the same configure
option, without any attempt to detect or negotiate. Do I guess
correctly that a library mismatch would produce an incomprehensible
corrupt stream message?
I just had a quick look at this patch, lured by the smell of your latest
messages... And it seems to me that this patch needs a heavy amount of
work as presented. There are a couple of things which are not really
nice, like forcing the presentation of the compression option in the
startup packet to begin with. The hijacking around secure_read() is
not nice either, as it is aimed at being a rather high-level API on top
of the method used with the backend. On top of adding some
documentation, I think that you could get some inspiration from the
recent GSSAPI encryption patch, which has been submitted again for the
v12 cycle and has spent a large amount of time designing its set of
options.
--
Michael
On 05.06.2018 08:26, Thomas Munro wrote:
Seems very useful. +1.
It looks like there is at least one place where zpq_stream.o's symbols
are needed but it isn't being linked in, so the build fails in some
ecpg stuff reached by make check-world.
Hi Thomas,
Thank you for review. Updated version of the patch fixing all reported
problems is attached.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
libpq-compression-4.patch (text/x-patch, +707 -81)
On 05.06.2018 09:04, Thomas Munro wrote:
Speaking of configuration, are you planning to support multiple
compression libraries at the same time? It looks like the current
patch implicitly requires client and server to use the same configure
option, without any attempt to detect or negotiate. Do I guess
correctly that a library mismatch would produce an incomprehensible
corrupt stream message?
Frankly speaking, I am not sure that support of multiple compression
libraries at the same time is actually needed.
If we build Postgres from sources, then both frontend and backend
libraries will use the same compression library.
zlib is available almost everywhere, and Postgres is using it in any case.
zstd is faster and provides a better compression ratio. So in principle it
may be useful to try zstd first and, if it is not available, then
use zlib.
It will require dynamic loading of these libraries.
libpq stream compression is not the only place where compression is used
in Postgres. So I think that the problem of choosing a compression
algorithm and supporting custom compression methods should be fixed at
some upper level. There is a patch for custom compression methods in the
commitfest. Maybe it should be combined with this one.
Right now, if the client and server libpq libraries were built with
different compression libraries, a decompression error will be reported.
Supporting multiple compression methods will require a more sophisticated
handshake protocol so that client and server can choose a compression
method which is supported by both of them.
But certainly it can be done.
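A handshake of that kind can be sketched as a simple preference-ordered negotiation. This function is purely illustrative and not part of the patch or any agreed protocol:

```python
def choose_compression(client_prefs, server_supported):
    """Pick the first algorithm in the client's preference order that
    the server also supports; fall back to no compression instead of
    failing the connection with a corrupt-stream error."""
    for algo in client_prefs:
        if algo in server_supported:
            return algo
    return None

print(choose_compression(["zstd", "zlib"], ["zlib"]))  # -> zlib
print(choose_compression(["zstd"], ["zlib"]))          # -> None
```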
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 05.06.2018 10:09, Michael Paquier wrote:
There are a couple of things which are not really nice, like forcing
the presentation of the compression option in the startup packet to
begin with. The hijacking around secure_read() is not nice either.
Thank you for the feedback.
I have considered this patch mostly as a prototype to estimate the
efficiency of libpq protocol compression and compare it with SSL
compression.
So I agree with you that there are a lot of things which should be improved.
But can you please clarify what is wrong with "forcing the presentation
of the compression option in the startup packet to begin with"?
Do you mean that it would be better to be able to switch compression
on/off during the session?
Also I do not completely understand what you mean by "hijacking
around secure_read()".
I looked at the GSSAPI patch. It does an injection in secure_read:

+#ifdef ENABLE_GSS
+    if (port->gss->enc)
+    {
+        n = be_gssapi_read(port, ptr, len);
+        waitfor = WL_SOCKET_READABLE;
+    }
+    else

But the main difference between encryption and compression is that
encryption does not change the data size, while compression does.
To be able to use streaming compression, I need to specify some function
for reading data from the stream. I am using secure_read for this purpose:

    PqStream = zpq_create((zpq_tx_func)secure_write,
                          (zpq_rx_func)secure_read, MyProcPort);

Can you please explain what the problem with it is?
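The tx/rx-callback design behind zpq_create can be modeled in a few lines. This Python sketch is a toy analogue only: all names are invented to mirror the patch's shape, and the in-memory "wire" stands in for the socket layer that secure_read/secure_write talk to.

```python
import zlib

class CompressedStream:
    """Toy analogue of zpq_create(tx_func, rx_func, ...): the compression
    layer owns raw-transport callbacks, and callers see only plain
    uncompressed reads/writes, even though the wire size differs."""

    def __init__(self, tx_func=None, rx_func=None):
        self._tx, self._rx = tx_func, rx_func
        self._comp = zlib.compressobj(1)
        self._decomp = zlib.decompressobj()

    def write(self, data: bytes) -> None:
        # Compress, then hand the (size-changed) bytes to the transport.
        self._tx(self._comp.compress(data) +
                 self._comp.flush(zlib.Z_SYNC_FLUSH))

    def read(self) -> bytes:
        # Pull raw bytes from the transport, return uncompressed data.
        return self._decomp.decompress(self._rx())

wire = []                                    # stands in for the socket
sender = CompressedStream(tx_func=wire.append)
receiver = CompressedStream(rx_func=wire.pop)
sender.write(b"COPY data row 1\n")
print(receiver.read())
```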
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 6/5/18 03:09, Michael Paquier wrote:
I just had a quick look at this patch, lured by the smell of your latest
messages... And it seems to me that this patch needs a heavy amount of
work as presented. There are a couple of things which are not really
nice, like forcing the presentation of the compression option in the
startup packet to begin with.
Yeah, at this point we will probably need a discussion and explanation
of the protocol behavior this is adding, such as how to negotiate
different compression settings.
Unrelatedly, I suggest skipping the addition of -Z options to various
client-side tools. This is unnecessary, since generic connection
options can already be specified via -d typically, and it creates
confusion because -Z is already used to specify output compression by
some programs.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5 June 2018 at 13:06, Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
wrote:
Yeah, at this point we will probably need a discussion and explanation
of the protocol behavior this is adding, such as how to negotiate
different compression settings.
As the maintainer of the JDBC driver I would think we would also like to
leverage this as well.
There are a few other drivers that implement the protocol, and I'm
sure they would want in as well.
I haven't looked at the patch but if we get to the point of negotiating
compression please let me know.
Thanks,
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On Wed, Jun 6, 2018 at 2:06 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
Thank you for review. Updated version of the patch fixing all reported
problems is attached.
Small problem on Windows [1]:

C:\projects\postgresql\src\include\common/zpq_stream.h(17): error
C2143: syntax error : missing ')' before '*'
[C:\projects\postgresql\libpq.vcxproj]
You used ssize_t in zpq_stream.h, but Windows doesn't have that type.
We have our own typedef in win32_port.h. Perhaps zpq_stream.c should
include postgres.h/postgres_fe.h (depending on FRONTEND) like the
other .c files in src/common, before it includes zpq_stream.h?
Instead of "c.h".
[1]: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1106
--
Thomas Munro
http://www.enterprisedb.com
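For what it's worth, the include order Thomas describes could look like the
following sketch at the top of zpq_stream.c (illustrative only; the header
names follow his suggestion, not the actual patch):

```c
/* zpq_stream.c -- sketch of the suggested include order, not the real patch */
#ifndef FRONTEND
#include "postgres.h"        /* backend build: pulls in c.h, and on Windows
                              * the ssize_t typedef from win32_port.h */
#else
#include "postgres_fe.h"     /* frontend build: same portability layer */
#endif

#include "common/zpq_stream.h"   /* ssize_t is now defined on Windows too */
```

This mirrors what the other .c files under src/common already do, so the
header itself stays free of portability ifdefs.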
On Tue, Jun 05, 2018 at 06:58:42PM +0300, Konstantin Knizhnik wrote:
I have considered this patch mostly as prototype to estimate efficiency of
libpq protocol compression and compare it with SSL compression.
So I agree with you that there are a lot of things which should be
improved.
Cool. It seems that there is some meaning for such a feature with
environments with spare CPU and network limitations.
But can you please clarify what is wrong with "forcing the presentation of
the compression option in the startup packet to begin with"?
Sure, I am referring to that in your v4:
if (conn->replication && conn->replication[0])
ADD_STARTUP_OPTION("replication", conn->replication);
+ if (conn->compression && conn->compression[0])
+ ADD_STARTUP_OPTION("compression", conn->compression);
There is no point in adding that as a mandatory field of the startup
packet.
Do you mean that it will be better to be able switch on/off compression
during session?
Not really, I get that this should be defined when the session is
established and remain until the session finishes. You have a couple of
restrictions like what to do with the first set of messages exchanged
but that could be delayed until the negotiation is done.
But the main difference between encryption and compression is that
encryption is not changing data size, while compression does.
To be able to use streaming compression, I need to specify some function for
reading data from the stream. I am using secure_read for this purpose:

    PqStream = zpq_create((zpq_tx_func)secure_write,
                          (zpq_rx_func)secure_read, MyProcPort);

Can you please explain what is the problem with it?
Likely I have not looked at your patch sufficiently, but the point I am
trying to make is that secure_read or pqsecure_read are entry points
which switch method depending on the connection details. The GSSAPI
encryption patch does that. Yours does not with stuff like that:
retry4:
- nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
- conn->inBufSize - conn->inEnd);
This makes the whole error handling more consistent, and the new option
layer as well more consistent with what happens in SSL, except that you
want to be able to combine SSL and compression as well so you need an
extra process which decompresses/compresses the data after doing a "raw"
or "ssl" read/write. I have not actually looked much at your patch, but
I am wondering if it could not be possible to make the whole footprint
less invasive which really worries me as now shaped. As you need to
register functions with say zpq_create(), it would be instinctively
nicer to do the handling directly in secure_read() and such.
Just my 2c.
--
Michael
On 06.06.2018 10:53, Michael Paquier wrote:
On Tue, Jun 05, 2018 at 06:58:42PM +0300, Konstantin Knizhnik wrote:
I have considered this patch mostly as prototype to estimate efficiency of
libpq protocol compression and compare it with SSL compression.
So I agree with you that there are a lot of things which should be
improved.

Cool. It seems that there is some meaning for such a feature with
environments with spare CPU and network limitations.

But can you please clarify what is wrong with "forcing the presentation of
the compression option in the startup packet to begin with"?

Sure, I am referring to that in your v4:

    if (conn->replication && conn->replication[0])
        ADD_STARTUP_OPTION("replication", conn->replication);
+   if (conn->compression && conn->compression[0])
+       ADD_STARTUP_OPTION("compression", conn->compression);

There is no point in adding that as a mandatory field of the startup
packet.
Sorry, but ADD_STARTUP_OPTION does not add a mandatory field to the startup
packet. This option can be omitted.
There are a lot of other options registered using ADD_STARTUP_OPTION,
for example all environment-driven GUCs:
    /* Add any environment-driven GUC settings needed */
    for (next_eo = options; next_eo->envName; next_eo++)
    {
        if ((val = getenv(next_eo->envName)) != NULL)
        {
            if (pg_strcasecmp(val, "default") != 0)
                ADD_STARTUP_OPTION(next_eo->pgName, val);
        }
    }
So I do not understand what is wrong with registering "compression" as an
option of the startup packet, and what the alternative for it would be...
Do you mean that it will be better to be able switch on/off compression
during session?

Not really, I get that this should be defined when the session is
established and remain until the session finishes. You have a couple of
restrictions like what to do with the first set of messages exchanged
but that could be delayed until the negotiation is done.

But the main difference between encryption and compression is that
encryption is not changing data size, while compression does.
To be able to use streaming compression, I need to specify some function for
reading data from the stream. I am using secure_read for this purpose:

    PqStream = zpq_create((zpq_tx_func)secure_write,
                          (zpq_rx_func)secure_read, MyProcPort);

Can you please explain what is the problem with it?
Likely I have not looked at your patch sufficiently, but the point I am
trying to make is that secure_read or pqsecure_read are entry points
which switch method depending on the connection details. The GSSAPI
encryption patch does that. Yours does not with stuff like that:

retry4:
- nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
- conn->inBufSize - conn->inEnd);

This makes the whole error handling more consistent, and the new option
layer as well more consistent with what happens in SSL, except that you
want to be able to combine SSL and compression as well so you need an
extra process which decompresses/compresses the data after doing a "raw"
or "ssl" read/write. I have not actually looked much at your patch, but
I am wondering if it could not be possible to make the whole footprint
less invasive which really worries me as now shaped. As you need to
register functions with say zpq_create(), it would be instinctively
nicer to do the handling directly in secure_read() and such.
Once again sorry, but I still do not understand the problem here.
If compression is enabled, then I am using zpq_read instead of
secure_read/pqsecure_read. But zpq_read is in turn calling
secure_read/pqsecure_read
to fetch more raw data. So if "secure_read or pqsecure_read are entry
points which switch method depending on the connection details", then
compression is not preventing them from making this choice. Compression
should be done prior to encryption, otherwise compression makes no sense.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 06.06.2018 02:03, Thomas Munro wrote:
On Wed, Jun 6, 2018 at 2:06 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

Thank you for review. Updated version of the patch fixing all reported
problems is attached.

Small problem on Windows[1]:

C:\projects\postgresql\src\include\common/zpq_stream.h(17): error
C2143: syntax error : missing ')' before '*'
[C:\projects\postgresql\libpq.vcxproj]
2395

You used ssize_t in zpq_stream.h, but Windows doesn't have that type.
We have our own typedef in win32_port.h. Perhaps zpq_stream.c should
include postgres.h/postgres_fe.h (depending on FRONTEND) like the
other .c files in src/common, before it includes zpq_stream.h?
Instead of "c.h".

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1106
Thank you very much for reporting the problem.
I attached a new patch with an include of postgres_fe.h added to zpq_stream.c.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company