Compressing table images
My apologies if this subject has already been hashed to death, or if
this is the wrong list, but I was wondering if people had seen this paper:
http://www.cwi.nl/htbin/ins1/publications?request=intabstract&key=ZuHeNeBo:ICDE:06
Basically it describes a compression algorithm for the tables of a
database. The big advantage is that it reduces disk traffic by
approximately a factor of four, at the cost of more CPU utilization.
Any thoughts or comments?
Brian
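To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. It is not the algorithm from the paper; it only models a sequential scan in which disk reads and decompression overlap, so the slower of the two dominates. The function name, the 100 MB/s disk rate and the 1 GB/s decompression rate are assumptions picked for illustration, not measurements.

```python
from typing import Optional


def scan_seconds(table_bytes: float,
                 disk_bytes_per_s: float,
                 ratio: float = 1.0,
                 decompress_bytes_per_s: Optional[float] = None) -> float:
    """Rough scan-time estimate for a table of `table_bytes` uncompressed bytes.

    Assumes I/O and decompression run in parallel, so the scan takes as
    long as the slower of the two. `ratio` is the compression ratio
    (4.0 means the on-disk size is a quarter of the raw size).
    """
    io_time = (table_bytes / ratio) / disk_bytes_per_s
    if decompress_bytes_per_s is None:
        cpu_time = 0.0  # uncompressed data: no decompression cost modelled
    else:
        cpu_time = table_bytes / decompress_bytes_per_s
    return max(io_time, cpu_time)


# Hypothetical numbers: a 4 GB table on a disk that delivers 100 MB/s.
plain = scan_seconds(4e9, 100e6)
packed = scan_seconds(4e9, 100e6, ratio=4.0, decompress_bytes_per_s=1e9)
print(plain, packed)
```

Under these assumptions the factor-of-four saving in disk traffic only turns into a factor-of-four speedup while decompression keeps up with the disk; once the CPU becomes the bottleneck, the gain shrinks.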
I don't know if that is the algorithm we use but PostgreSQL will
compress its data within the table.
Joshua D. Drake
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Joshua D. Drake wrote:
> I don't know if that is the algorithm we use but PostgreSQL will
> compress its data within the table.
But only in certain very specific cases. And we compress on a
per-attribute basis. Compressing at the page level is pretty much out
of the question; but compressing at the tuple level I think is doable.
How much benefit that brings is another matter. I think we still have
more use for our limited manpower elsewhere.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
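For readers unfamiliar with the distinction, per-attribute compression can be sketched roughly as follows. This is an illustration only, not PostgreSQL's code: Python's zlib stands in for PostgreSQL's own compressor, and the function names and the 32-byte minimum size are hypothetical.

```python
import zlib
from typing import List, Tuple


def compress_attribute(value: bytes, min_size: int = 32) -> Tuple[bool, bytes]:
    """Compress a single attribute value.

    Returns (is_compressed, stored_bytes). Short values, and values that
    do not shrink, are stored as-is.
    """
    if len(value) < min_size:
        return False, value
    packed = zlib.compress(value)
    if len(packed) >= len(value):
        return False, value
    return True, packed


def decompress_attribute(is_compressed: bool, stored: bytes) -> bytes:
    """Undo compress_attribute for one stored value."""
    return zlib.decompress(stored) if is_compressed else stored


def store_tuple(attrs: List[bytes]) -> List[Tuple[bool, bytes]]:
    """Apply the per-attribute decision to every column of one tuple."""
    return [compress_attribute(a) for a in attrs]


# A two-column row: a tiny value and a long, repetitive one.
row = [b"42", b"abc" * 500]
stored = store_tuple(row)
restored = [decompress_attribute(flag, data) for flag, data in stored]
print([flag for flag, _ in stored], restored == row)
```

Because each value carries its own flag, a reader only pays the decompression cost for the columns it actually touches, which is one reason per-attribute compression is easier to fit into an existing tuple format than page-level compression.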
On Thu, May 11, 2006 at 05:05:26PM -0400, Alvaro Herrera wrote:
> But only in certain very specific cases. And we compress on a
> per-attribute basis. Compressing at the page level is pretty much out
> of the question; but compressing at the tuple level I think is doable.
> How much benefit that brings is another matter. I think we still have
> more use for our limited manpower elsewhere.
Except that I think it would be highly useful to allow users to change
the limits used for both toasting and compressing on a per-table and/or
per-field basis. For example, if you have a varchar(1500) in a table
it's unlikely to ever be large enough to trigger toasting, but if that
field is rarely updated it could be a big win to store it toasted. Of
course you can always create a 'side table' (vertical partitioning), but
all of that framework already exists in the database; we just don't
provide the required knobs. I suspect it wouldn't be that hard to expose
those knobs. In fact, if we could agree on syntax, this is probably a
beginner TODO.
ISTR having this discussion on one of the lists recently, but I can't
find it, and don't see anything in the TODO. Basically, I think we'd
want knobs that say: if this field is over X size, compress it. If it's
over Y size, store it externally. Per-table and per-cluster (i.e., GUC)
knobs for that would be damn handy as well.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
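The knobs described above could behave roughly like the following Python sketch. Everything in it is hypothetical: the option names compress_over and external_over, the default values, and the lookup order (per-field overrides per-table, which overrides the cluster-wide default) are assumptions made for illustration, not existing PostgreSQL settings. It also assumes both thresholds are checked against the uncompressed size, which the proposal leaves open.

```python
from typing import Dict, Optional

# Hypothetical cluster-wide defaults (the "GUC" level), in bytes.
CLUSTER_DEFAULTS: Dict[str, int] = {"compress_over": 2000, "external_over": 8000}


def resolve(knob: str,
            column_opts: Dict[str, int],
            table_opts: Dict[str, int],
            cluster_opts: Dict[str, int]) -> int:
    """Return the most specific setting for `knob`: field, then table, then cluster."""
    for opts in (column_opts, table_opts, cluster_opts):
        if knob in opts:
            return opts[knob]
    raise KeyError(knob)


def storage_plan(size: int,
                 column_opts: Optional[Dict[str, int]] = None,
                 table_opts: Optional[Dict[str, int]] = None) -> Dict[str, bool]:
    """Decide how a value of `size` bytes would be stored.

    Over X (compress_over) it is compressed; over Y (external_over) it is
    stored externally. Both checks use the uncompressed size.
    """
    column_opts = column_opts or {}
    table_opts = table_opts or {}
    x = resolve("compress_over", column_opts, table_opts, CLUSTER_DEFAULTS)
    y = resolve("external_over", column_opts, table_opts, CLUSTER_DEFAULTS)
    return {"compress": size > x, "external": size > y}


# A 1200-byte value in a varchar(1500) column: untouched under the defaults,
# but moved out of line once the field gets its own lower threshold.
print(storage_plan(1200))
print(storage_plan(1200, column_opts={"external_over": 1000}))
```

With the defaults, the 1200-byte value stays inline and uncompressed; the per-field override is what lets a rarely updated column be pushed out of the main table without creating a side table by hand.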