Compression
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
On Thursday, April 14, 2011 4:01:54 pm Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
TOAST?
http://www.postgresql.org/docs/9.0/interactive/storage-toast.html
--
Adrian Klaver
adrian.klaver@gmail.com
On 15/04/2011 7:01 AM, Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
There's no row compression, but as mentioned by others there is
out-of-line compression of large values using TOAST.
Row compression would be interesting, but I can't imagine it not having
been investigated already.
--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
On Thursday, April 14, 2011 4:50:44 pm Craig Ringer wrote:
On 15/04/2011 7:01 AM, Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
There's no row compression, but as mentioned by others there is
out-of-line compression of large values using TOAST.
I could be misunderstanding but I thought compression happened in the row as
well. From the docs:
"EXTENDED allows both compression and out-of-line storage. This is the default
for most TOAST-able data types. Compression will be attempted first, then
out-of-line storage if the row is still too big."
Row compression would be interesting, but I can't imagine it not having
been investigated already.
--
Adrian Klaver
adrian.klaver@gmail.com
On Thu, Apr 14, 2011 at 5:07 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, April 14, 2011 4:50:44 pm Craig Ringer wrote:
On 15/04/2011 7:01 AM, Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
There's no row compression, but as mentioned by others there is
out-of-line compression of large values using TOAST.
I could be misunderstanding but I thought compression happened in the row as
well. From the docs:
"EXTENDED allows both compression and out-of-line storage. This is the
default for most TOAST-able data types. Compression will be attempted first,
then out-of-line storage if the row is still too big."
Row compression would be interesting, but I can't imagine it not having
been investigated already.
--
Adrian Klaver
adrian.klaver@gmail.com
Already know about TOAST. I could've been clearer, but that's not the
same as the block-/page-level compression I was referring to.
--
Yang Zhang
http://yz.mit.edu/
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Yang Zhang
Sent: Thursday, April 14, 2011 6:51 PM
To: Adrian Klaver
Cc: pgsql-general@postgresql.org; Craig Ringer
Subject: Re: [GENERAL] Compression
On Thu, Apr 14, 2011 at 5:07 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, April 14, 2011 4:50:44 pm Craig Ringer wrote:
On 15/04/2011 7:01 AM, Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
There's no row compression, but as mentioned by others there is
out-of-line compression of large values using TOAST.
I could be misunderstanding but I thought compression happened in the row as
well. From the docs:
"EXTENDED allows both compression and out-of-line storage. This is the
default for most TOAST-able data types. Compression will be attempted first,
then out-of-line storage if the row is still too big."
Row compression would be interesting, but I can't imagine it not having
been investigated already.
--
Adrian Klaver
adrian.klaver@gmail.com
Already know about TOAST. I could've been clearer, but that's not the
same as the block-/page-level compression I was referring to.
There is a (closed source) PG fork that has row (or column) oriented storage
that can have compression applied to it... if you are willing to give up
updates and deletes on the table, that is.
I haven't seen a lot of people talking about wanting that in the Postgres
core tho.
-M
--
Yang Zhang
http://yz.mit.edu/
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thursday, April 14, 2011 5:51:21 pm Yang Zhang wrote:
Already know about TOAST. I could've been clearer, but that's not the
same as the block-/page-level compression I was referring to.
I am obviously missing something. The TOAST mechanism is designed to keep tuple
data below the default 8KB page size. In fact it kicks in at a lower level than
that:
"The TOAST code is triggered only when a row value to be stored in a table is
wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code will
compress and/or move field values out-of-line until the row value is shorter than
TOAST_TUPLE_TARGET bytes (also normally 2 kB) or no more gains can be had.
During an UPDATE operation, values of unchanged fields are normally preserved
as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if
none of the out-of-line values change."
Granted not all data types are TOASTable. Are you looking for something more
aggressive than that?
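The trigger-and-shrink behaviour described in that docs excerpt can be sketched as a toy model. This is only an illustration under stated assumptions, not PostgreSQL's actual code: zlib stands in for the server's own compressor, 2048 bytes stands in for "normally 2 kB", and `toast_row` and `POINTER_SIZE` are hypothetical names introduced here.

```python
import os
import zlib

# "Normally 2 kB" per the docs quoted above; 2048 is an assumed stand-in.
TOAST_TUPLE_THRESHOLD = 2048
TOAST_TUPLE_TARGET = 2048
POINTER_SIZE = 18  # assumed size of the in-row pointer left behind


def toast_row(fields):
    """Toy model: return a (state, stored_size) pair for each field."""
    stored = [("plain", len(f)) for f in fields]

    def row_size():
        return sum(size for _, size in stored)

    # Nothing happens unless the row is wider than the threshold.
    if row_size() <= TOAST_TUPLE_THRESHOLD:
        return stored

    # Pass 1: try compressing fields, largest first, until the row fits.
    by_raw_size = sorted(range(len(fields)), key=lambda i: len(fields[i]), reverse=True)
    for i in by_raw_size:
        if row_size() <= TOAST_TUPLE_TARGET:
            break
        compressed = len(zlib.compress(fields[i]))
        if compressed < stored[i][1]:
            stored[i] = ("compressed", compressed)

    # Pass 2: move the largest remaining fields out of line.
    by_stored_size = sorted(range(len(fields)), key=lambda i: stored[i][1], reverse=True)
    for i in by_stored_size:
        if row_size() <= TOAST_TUPLE_TARGET:
            break
        if stored[i][1] > POINTER_SIZE:
            stored[i] = ("external", POINTER_SIZE)

    return stored


print(toast_row([b"a" * 100, b"b" * 100]))   # narrow row: left alone
print(toast_row([b"x" * 5000, b"id"]))       # wide, compressible value
print(toast_row([os.urandom(5000), b"id"]))  # wide, incompressible value
```

The point of the sketch is that the mechanism works value by value and only once a row is already wide, which is why it is not the same thing as compressing whole pages of ordinary rows.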
--
Adrian Klaver
adrian.klaver@gmail.com
On Thu, Apr 14, 2011 at 7:42 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, April 14, 2011 5:51:21 pm Yang Zhang wrote:
Already know about TOAST. I could've been clearer, but that's not the
same as the block-/page-level compression I was referring to.
I am obviously missing something. The TOAST mechanism is designed to keep
tuple data below the default 8KB page size. In fact it kicks in at a lower
level than that:
"The TOAST code is triggered only when a row value to be stored in a table
is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code
will compress and/or move field values out-of-line until the row value is
shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB) or no more gains
can be had. During an UPDATE operation, values of unchanged fields are
normally preserved as-is; so an UPDATE of a row with out-of-line values
incurs no TOAST costs if none of the out-of-line values change."
Granted not all data types are TOASTable. Are you looking for something more
aggressive than that?
Yes.
http://blog.oskarsson.nu/2009/03/hadoop-feat-lzo-save-disk-space-and.html
http://wiki.apache.org/hadoop/UsingLzoCompression
http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-compression-internals-algorithms.html
--
Yang Zhang
http://yz.mit.edu/
On Thursday, April 14, 2011 7:46:34 pm Yang Zhang wrote:
On Thu, Apr 14, 2011 at 7:42 PM, Adrian Klaver <adrian.klaver@gmail.com>
wrote:
Granted not all data types are TOASTable. Are you looking for something
more aggressive than that?
Yes.
http://blog.oskarsson.nu/2009/03/hadoop-feat-lzo-save-disk-space-and.html
http://wiki.apache.org/hadoop/UsingLzoCompression
http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-compression-internals-algorithms.html
I can see that as another use case for SQL/MED in 9.1+.
--
Adrian Klaver
adrian.klaver@gmail.com
On Thu, Apr 14, 2011 at 6:46 PM, mark <dvlhntr@gmail.com> wrote:
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Yang Zhang
Sent: Thursday, April 14, 2011 6:51 PM
To: Adrian Klaver
Cc: pgsql-general@postgresql.org; Craig Ringer
Subject: Re: [GENERAL] Compression
On Thu, Apr 14, 2011 at 5:07 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, April 14, 2011 4:50:44 pm Craig Ringer wrote:
On 15/04/2011 7:01 AM, Yang Zhang wrote:
Is there any effort to add compression into PG, a la MySQL's
row_format=compressed or HBase's LZO block compression?
There's no row compression, but as mentioned by others there is
out-of-line compression of large values using TOAST.
I could be misunderstanding but I thought compression happened in the row as
well. From the docs:
"EXTENDED allows both compression and out-of-line storage. This is the
default for most TOAST-able data types. Compression will be attempted first,
then out-of-line storage if the row is still too big."
Row compression would be interesting, but I can't imagine it not having
been investigated already.
--
Adrian Klaver
adrian.klaver@gmail.com
Already know about TOAST. I could've been clearer, but that's not the
same as the block-/page-level compression I was referring to.
There is a (closed source) PG fork that has row (or column) oriented storage
that can have compression applied to it... if you are willing to give up
updates and deletes on the table, that is.
Greenplum and Aster?
We *are* mainly doing analytical (non-updating/deleting) processing.
But it's not a critical pain point - we're mainly interested in FOSS
for now.
I haven't seen a lot of people talking about wanting that in the Postgres
core tho.
-M
--
Yang Zhang
http://yz.mit.edu/
On 15/04/2011 8:07 AM, Adrian Klaver wrote:
"EXTENDED allows both compression and out-of-line storage. This is the
default for most TOAST-able data types. Compression will be attempted
first, then out-of-line storage if the row is still too big."
Good point. I was unclear; thanks for pointing it out.
What I was trying to say is that there's no whole-row compression, i.e.
compression of the whole tuple except for minimal headers. A value in a
field may be compressed, but you can't (say) compress a 100-column row
of integers in Pg, because the individual fields don't support compression.
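A rough way to see the difference between per-field, whole-row, and page-level compression is the sketch below. It is an illustration under assumptions, not a measurement of PostgreSQL: zlib stands in for any real storage-level compressor, the integers are packed as plain 4-byte values with no tuple headers, and `compressed_sizes` is a hypothetical helper.

```python
import struct
import zlib


def compressed_sizes(rows):
    """rows: list of rows, each a list of 32-bit integers."""
    packed_rows = [b"".join(struct.pack("<i", v) for v in row) for row in rows]
    raw = sum(len(p) for p in packed_rows)

    # Per field: compress every 4-byte value alone, keep it only if smaller.
    per_field = sum(
        min(4, len(zlib.compress(struct.pack("<i", v))))
        for row in rows
        for v in row
    )
    # Per row: compress each packed row as one unit.
    per_row = sum(min(len(p), len(zlib.compress(p))) for p in packed_rows)
    # Per page: compress all rows together, as a block-level scheme would.
    page = b"".join(packed_rows)
    per_page = min(len(page), len(zlib.compress(page)))

    return {"raw": raw, "per_field": per_field, "per_row": per_row, "per_page": per_page}


# 20 rows of 100 small integers each (made-up sample data).
rows = [[(r + c) % 5 for c in range(100)] for r in range(20)]
print(compressed_sizes(rows))
```

A 4-byte value is too small for any general-purpose compressor to shrink on its own, so the per-field total stays at the raw size, while compressing a whole row or a whole page can exploit redundancy across columns and rows.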
--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
On Thursday, April 14, 2011 9:37:10 pm Craig Ringer wrote:
On 15/04/2011 8:07 AM, Adrian Klaver wrote:
"EXTENDED allows both compression and out-of-line storage. This is the
default for most TOAST-able data types. Compression will be attempted
first, then out-of-line storage if the row is still too big."
Good point. I was unclear; thanks for pointing it out.
What I was trying to say is that there's no whole-row compression, i.e.
compression of the whole tuple except for minimal headers. A value in a
field may be compressed, but you can't (say) compress a 100-column row
of integers in Pg, because the individual fields don't support compression.
Got it now, thanks.
--
Adrian Klaver
adrian.klaver@gmail.com
Where do I find more information about the PG fork you mentioned?