forcing compression of text field

Started by Jonathan Ellisover 19 years ago6 messagesgeneral
Jump to latest
#1Jonathan Ellis
jonathan@utahpython.org

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

--
Jonathan Ellis
http://spyced.blogspot.com

#2Jeff Davis
pgsql@j-davis.com
In reply to: Jonathan Ellis (#1)
Re: forcing compression of text field

On Mon, 2006-12-11 at 09:18 -0700, Jonathan Ellis wrote:

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

Are you sure PostgreSQL isn't compressing it? I didn't think there was a
minimum threshold for compression.

Regards,
Jeff Davis

#3Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Jonathan Ellis (#1)
Re: forcing compression of text field

On Mon, 2006-12-11 at 10:18, Jonathan Ellis wrote:

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

http://www.postgresql.org/docs/8.1/interactive/storage-toast.html

Has all your answers.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Scott Marlowe (#3)
Re: forcing compression of text field

Scott Marlowe <smarlowe@g2switchworks.com> writes:

On Mon, 2006-12-11 at 10:18, Jonathan Ellis wrote:

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

http://www.postgresql.org/docs/8.1/interactive/storage-toast.html
Has all your answers.

The bottom line is that PG doesn't bother trying to compress values
less than about 2KB long. While you could make a custom build with a
different threshold, the fact remains that LZ-style compression is not
real efficient on short stretches of text. If you "really need to save
disk space" it behooves you to consider that. I'd suggest thinking about
whether you can merge multiple log entries, or something, such that the
field values you need to store are on the order of a few KB.

regards, tom lane

#5Ron Johnson
ron.l.johnson@cox.net
In reply to: Jonathan Ellis (#1)
Re: forcing compression of text field

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/11/06 10:18, Jonathan Ellis wrote:

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

What can be compressed? Trailing whitespace or repeating substrings?

- --
Ron Johnson, Jr.
Jefferson LA USA

Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFfsnvS9HxQb37XmcRAuaeAKDRv9IUDH4zenHoVQPST0vFbpHLkwCdED9k
rvvOkNCRx/J8EkGtBF2Bs9Y=
=XANI
-----END PGP SIGNATURE-----

#6Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#4)
Re: forcing compression of text field

Tom Lane wrote:

Scott Marlowe <smarlowe@g2switchworks.com> writes:

On Mon, 2006-12-11 at 10:18, Jonathan Ellis wrote:

I have a table of log messages. They are mostly in the 100-200
character length, which apparently isn't large enough for PG to want
to compress it (length == octet_length). I really need to save disk
space. I can store it as a bytea and compress it manually (zlib level
1 compression gives about 50% savings), but is there a way to force
pg's own compression before I resort to this?

http://www.postgresql.org/docs/8.1/interactive/storage-toast.html
Has all your answers.

The bottom line is that PG doesn't bother trying to compress values
less than about 2KB long. While you could make a custom build with a
different threshold, the fact remains that LZ-style compression is not
real efficient on short stretches of text. If you "really need to save
disk space" it behooves you to consider that. I'd suggest thinking about
whether you can merge multiple log entries, or something, such that the
field values you need to store are on the order of a few KB.

See ALTER TABLE ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL |
EXTENDED | MAIN }.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +