Is there a plan for LZMA compression in pg_dump
Hi.
Is it on the TODO list, or is there a plan, to implement LZMA compression in
pg_dump backups?
Thanks
Stano
--
------------------------------------------------------------------------
Space Systems
*Mgr. Stano LACKO*
mobil: +421 908 175 753
fax.: +421 2 555 72 676
e-mail: lacko@spacesystems.sk <mailto:lacko@spacesystems.sk>
*Space Systems, s.r.o.*
Zámocká 30
811 01 Bratislava
www.spacesystems.sk <http://www.spacesystems.sk/>
Stanislav Lacko wrote:
Hi.
Is it on the TODO list, or is there a plan, to implement LZMA compression in
pg_dump backups?
Nope, never heard anything about it.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
Sent: Wednesday, February 04, 2009 3:28 PM
To: Stanislav Lacko
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Is a plan for lmza commpression in pg_dump
Stanislav Lacko wrote:
Hi.
Is it on the TODO list, or is there a plan, to implement LZMA compression in
pg_dump backups?
Nope, never heard anything about it.
In case the PG group does get interested in adding compression
algorithms to PostgreSQL (it seems they could be useful in many
different areas), the 7zip format seems to be excellent in a number of
ways.
Here is an interesting benchmark that shows 7z format winning a large
area of the "optimal compressors" performance graph:
http://users.elis.ugent.be/~wheirman/compression/
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
Unfortunately LZOP (which wins the top half of the "optimal compressors"
graph where the compression and decompression speed is more important
than amount of compression) does not have a liberal license.
http://www.lzop.org/
Dann Corbit wrote:
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
I played with this but found the SDK extremely confusing and flat out horrible.
One personal dislike was the unnecessary use of C++; although it was the
horrible API that turned me off. I'm not even sure if I ever got a test program
working.
LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy API
with many variants; my fav is LZO1X-1(15). It's known for its compression and
decompression speeds ... it's blazing fast. zlib typically gets 5-8% more
compression.
--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
Dann Corbit wrote:
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
I played with this but found the SDK extremely confusing and flat out
horrible. One personal dislike was the unnecessary use of C++; although it
was the horrible API that turned me off. I'm not even sure if I ever got a
test program working.
LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy
API with many variants; my fav is LZO1X-1(15). It's known for its
compression and decompression speeds ... it's blazing fast. zlib typically
gets 5-8% more compression.
LZO rocks. I wonder if the lzo developer would consider a license exception
so that postgresql could use it? What would we need?
-dg
--
David Gould daveg@sonic.net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
daveg wrote:
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
Dann Corbit wrote:
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
I played with this but found the SDK extremely confusing and flat out
horrible. One personal dislike was the unnecessary use of C++; although it
was the horrible API that turned me off. I'm not even sure if I ever got a
test program working.
LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy
API with many variants; my fav is LZO1X-1(15). It's known for its
compression and decompression speeds ... it's blazing fast. zlib typically
gets 5-8% more compression.
LZO rocks. I wonder if the lzo developer would consider a license exception
so that postgresql could use it? What would we need?
Probably a BSD license or a clean room implementation which we could BSD
license.
cheers
andrew
daveg wrote:
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
Dann Corbit wrote:
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
I played with this but found the SDK extremely confusing and flat out
horrible. One personal dislike was the unnecessary use of C++; although it
was the horrible API that turned me off. I'm not even sure if I ever got a
test program working.
LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy
API with many variants; my fav is LZO1X-1(15). It's known for its
compression and decompression speeds ... it's blazing fast. zlib typically
gets 5-8% more compression.
LZO rocks. I wonder if the lzo developer would consider a license exception
so that postgresql could use it? What would we need?
The chance of us using anything but zlib is near zero, so please do
not pursue this; this discussion comes up much too often.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Sat, Feb 07, 2009 at 02:47:05PM -0500, Bruce Momjian wrote:
daveg wrote:
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
Dann Corbit wrote:
The LZMA SDK is granted to the public domain:
http://www.7-zip.org/sdk.html
I played with this but found the SDK extremely confusing and flat out
horrible. One personal dislike was the unnecessary use of C++; although it
was the horrible API that turned me off. I'm not even sure if I ever got a
test program working.
LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy
API with many variants; my fav is LZO1X-1(15). It's known for its
compression and decompression speeds ... it's blazing fast. zlib typically
gets 5-8% more compression.
LZO rocks. I wonder if the lzo developer would consider a license exception
so that postgresql could use it? What would we need?
The chance of us using anything but zlib is near zero, so please do
not pursue this; this discussion comes up much too often.
That this comes up "much too often" suggests that there is more than near
zero interest. Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.
I think the context here is for pg_dump only, and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated. Hence my question: what sort of
license accommodation would we need to be able to use this library?
-dg
--
David Gould daveg@sonic.net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
On 7 Feb 2009, at 21:08, daveg wrote:
That this comes up "much too often" suggests that there is more than near
zero interest. Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.
I don't see anything wrong with using standard Unix pipes... and doing it
in a truly Unix and scalable way!
That this comes up "much too often" suggests that there is more than near
zero interest. Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.
I think the context here is for pg_dump only and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated. Hence my question, what sort of
license accommodation would we need to be able to use this library?
Based on previous discussions, I suspect that the answer here is
"complete relicensing as BSD". I think pursuing any sort of licensing
exception is completely futile as there will still be restrictions
that will be unacceptable to many in the community.
But if someone had an actual BSD-LICENSED compression library that was
better than what we have now, I'm not sure why Bruce (or anyone)
should be opposed to incorporating it. It's just that all of the
proposals that come up for this sort of thing aren't that.
...Robert
Robert Haas wrote:
That this comes up "much too often" suggests that there is more than near
zero interest. Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.
I think the context here is for pg_dump only and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated. Hence my question, what sort of
license accommodation would we need to be able to use this library?
Based on previous discussions, I suspect that the answer here is
"complete relicensing as BSD". I think pursuing any sort of licensing
exception is completely futile as there will still be restrictions
that will be unacceptable to many in the community.
But if someone had an actual BSD-LICENSED compression library that was
better than what we have now, I'm not sure why Bruce (or anyone)
should be opposed to incorporating it. It's just that all of the
proposals that come up for this sort of thing aren't that.
You can bet I would oppose it. It is not efficient for us to support
every compression-of-the-month project that comes along. If something
was BSD, well tested, and clearly superior, we might consider it, but I
have seen nothing like that for 10 years and I doubt I will see
something in the next 5. I am thinking we need to add this to the
"Features we do not want" section of our todo list.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Feb 7, 2009, at 4:53 PM, Bruce Momjian <bruce@momjian.us> wrote:
Robert Haas wrote:
That this comes up "much too often" suggests that there is more than near
zero interest. Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.
I think the context here is for pg_dump only and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated. Hence my question, what sort of
license accommodation would we need to be able to use this library?
Based on previous discussions, I suspect that the answer here is
"complete relicensing as BSD". I think pursuing any sort of licensing
exception is completely futile as there will still be restrictions
that will be unacceptable to many in the community.
But if someone had an actual BSD-LICENSED compression library that was
better than what we have now, I'm not sure why Bruce (or anyone)
should be opposed to incorporating it. It's just that all of the
proposals that come up for this sort of thing aren't that.
You can bet I would oppose it. It is not efficient for us to support
every compression-of-the-month project that comes along. If something
was BSD, well tested, and clearly superior, we might consider it, but I
Well that's pretty much what I said.
have seen nothing like that for 10 years and I doubt I will see
something in the next 5. I am thinking
I am doubtful too.
we need to add this to the
"Features we do not want" section of our todo list.
"Proprietary compression algorithms, even with Postgresql-specific
license exceptions"?
...Robert
Robert Haas wrote:
have seen nothing like that for 10 years and I doubt I will see
something in the next 5. I am thinking
I am doubtful too.
we need to add this to the
"Features we do not want" section of our todo list.
"Proprietary compression algorithms, even with Postgresql-specific
license exceptions"?
Yep. Does it make sense to make our license more complex to get 1%
better compression in certain cases? Probably not. Also
consider the code maintenance, patents, larger tarball, bugs, etc.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Sat, Feb 07, 2009 at 08:49:29PM -0500, Robert Haas wrote:
On Feb 7, 2009, at 4:53 PM, Bruce Momjian <bruce@momjian.us> wrote:
we need to add this to the "Features we do not want" section of our
todo list.
"Proprietary compression algorithms, even with Postgresql-specific
license exceptions"?
Considering that the entire project ships with a BSD license, which
very specifically allows use of all or any tiniest part of it with
(skipping some legalese) two restrictions: mention PGDG in the
copyright list, and don't sue us no matter what happens, any
"Postgresql-specific license exceptions" are equivalent to "that
algorithm is no longer proprietary" because any project could simply
use PostgreSQL's version and have done.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sat, Feb 07, 2009 at 08:49:29PM -0500, Robert Haas wrote:
"Proprietary compression algorithms, even with Postgresql-specific
license exceptions"?
To be fair, lzo is GPL, which is a stretch to consider proprietary.
-dg
--
David Gould daveg@sonic.net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
On Sat, Feb 07, 2009 at 08:31:23PM -0800, David Fetter wrote:
Considering that the entire project ships with a BSD license, which
very specifically allows use of all or any tiniest part of it with
(skipping some legalese) two restrictions: mention PGDG in the
copyright list, and don't sue us no matter what happens, any
"Postgresql-specific license exceptions" are equivalent to "that
algorithm is no longer proprietary" because any project could simply
use PostgreSQL's version and have done.
Why don't we just add an option to pg_dump --use-compress-program, just
like tar and then people can use their "compression algorithm of the
week" and we don't need to care about the licence or anything.
It's not like the case of TOAST where it actually needs to be builtin.
Tar doesn't have any compression builtin, yet you don't see many
uncompressed tar files...
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
On 8 Feb 2009, at 02:49, Robert Haas <robertmhaas@gmail.com> wrote:
On Feb 7, 2009, at 4:53 PM, Bruce Momjian <bruce@momjian.us> wrote:
we need to add this to the
"Features we do not want" section of our todo list.
"Proprietary compression algorithms, even with Postgresql-specific
license exceptions"?
Now that I would agree with. We would have to explain that we're BSD
licensed *because* we want people to be able to reuse our code outside
Postgres, including in commercial projects.
Why don't we just add an option to pg_dump --use-compress-program, just
like tar and then people can use their "compression algorithm of the
week" and we don't need to care about the licence or anything.
Can't this be done already?
pg_dump -Z 0 | compression_binary >mydump
If -Z is unspecified, I think it won't compress? Maybe you can just drop the -Z.
--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
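For illustration only (not something posted in the thread), the pipe approach suggested above can be sketched in Python with the standard subprocess module. The helper name dump_through_compressor and the example commands in the comment are assumptions; any external compressor that reads stdin and writes stdout would fit.

```python
import subprocess


def dump_through_compressor(dump_cmd, compress_cmd, out_path):
    """Pipe the stdout of dump_cmd through compress_cmd into out_path.

    Equivalent in spirit to:  dump_cmd | compress_cmd > out_path
    """
    with open(out_path, "wb") as out:
        dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
        comp = subprocess.Popen(compress_cmd, stdin=dump.stdout, stdout=out)
        # Close our copy of the pipe so the producer sees a broken pipe
        # if the compressor exits early.
        dump.stdout.close()
        comp_rc = comp.wait()
        dump_rc = dump.wait()
    if dump_rc != 0 or comp_rc != 0:
        raise RuntimeError(
            f"pipeline failed: dump={dump_rc}, compressor={comp_rc}"
        )


# Hypothetical usage (database name and compressor are assumptions):
# dump_through_compressor(["pg_dump", "-Z", "0", "mydb"], ["xz", "-c"], "mydump.xz")
```

This keeps the compressor choice entirely outside pg_dump, at the cost of compressing the whole output as one stream.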
Martijn van Oosterhout wrote:
Why don't we just add an option to pg_dump --use-compress-program, just
like tar and then people can use their "compression algorithm of the
week" and we don't need to care about the licence or anything.
It's not like the case of TOAST where it actually needs to be builtin.
Tar doesn't have any compression builtin, yet you don't see many
uncompressed tar files...
tar compresses/decompresses the whole archive via a single pipe. pg_dump
compresses individual data members. If the compression isn't builtin it
will make life much more difficult, and probably make parallel restore
as well as some other operations well nigh impossible.
cheers
andrew
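A simplified sketch of the distinction drawn above, using Python's zlib module: compressing the whole archive as one stream versus compressing each data member separately. The dictionary-based "archive" and the function names are illustrative assumptions and do not reflect pg_dump's actual archive format.

```python
import zlib


def compress_whole(members):
    """tar-style: concatenate all members and compress them as one stream."""
    return zlib.compress(b"".join(members.values()))


def compress_per_member(members):
    """Simplified per-member style: each member is compressed on its own."""
    return {name: zlib.compress(data) for name, data in members.items()}


def read_member(archive, name):
    """Decompress a single member without touching the others."""
    return zlib.decompress(archive[name])


# Made-up table data standing in for dumped rows.
members = {
    "table_a": b"1\talice\n" * 5000,
    "table_b": b"2\tbob\n" * 5000,
}

whole = compress_whole(members)
per_member = compress_per_member(members)

# With per-member compression, table_b can be restored independently,
# for example by a separate worker. With a single stream, a reader has to
# decompress from the start to reach it.
table_b = read_member(per_member, "table_b")
```

Independent access to each member is what makes operations such as parallel restore practical, which is harder to arrange when an external program compresses the entire output through one pipe.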
daveg wrote:
I think the context here is for pg_dump only and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated.
Some kind of performance data (space and time) would be required to
support any change in this area.
Notice that the thread originally called for lzma support, which is
completely at the opposite end of the spectrum of compression algorithms
in terms of space and time, compared to lzo. So it's not really clear
what the requirements are in the first place.
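As a rough sketch of how such space and time data might be gathered, the following compares zlib and LZMA from the Python standard library on synthetic tab-separated rows. The sample data, the compression levels, and the helper names are assumptions; LZO is not in the standard library and is left out. Real measurements would need actual dump output.

```python
import lzma
import time
import zlib


def make_sample(rows=20000):
    """Build synthetic tab-separated rows, loosely resembling COPY output."""
    lines = [f"{i}\tuser{i % 97}\t{i * 31 % 1000}\n" for i in range(rows)]
    return "".join(lines).encode()


def measure(name, compress, data):
    """Return compressed/original size ratio and wall-clock seconds."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return {"name": name, "ratio": len(out) / len(data), "seconds": elapsed}


sample = make_sample()
results = [
    measure("zlib level 6", lambda d: zlib.compress(d, 6), sample),
    measure("lzma preset 6", lambda d: lzma.compress(d, preset=6), sample),
]

for r in results:
    print(f"{r['name']}: ratio={r['ratio']:.3f} time={r['seconds']:.3f}s")
```

The numbers depend heavily on the data and the machine, so a single run like this is only a starting point for the kind of evidence requested.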
Peter Eisentraut wrote:
Notice that the thread originally called for lzma support, which is
completely at the opposite end of the spectrum of compression algorithms
in terms of space and time, compared to lzo. So it's not really clear
what the requirements are in the first place.
Instead of trying to figure out the needs/wants of a DBA and build a general-purpose
solution, it might be better to figure out how to make the compression choice
user-driven. Maybe the requirement should be to make this the user's decision;
piping the output to the compressor of choice seems to be the simplest approach.
There are cases where the highest compression is desired even if it takes forever, and
cases for just the opposite. Not sure why this has to be builtin or why it must
use zlib, other than that this is the current method.
--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/