Compressed Backup too big
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size on disk returned by pg_database_size() is 210 MB.
The compressed backup file size is now 125 MB.
This seems too big. I expect compression to decrease the size roughly 10
times; also, indexes are not backed up. A year ago the compressed backup
size was only 9 MB.
I created a query returning the biggest tables with and without indexes and found:
   table            without indexes   with indexes
1  pg_toast_22185   95 MB             96 MB
2  rid              21 MB             27 MB
3  klient           13 MB             19 MB
4  mailbox          10 MB             11 MB
5  dok              7640 kB           12 MB
6  desktop          8080 kB           8200 kB
7  strings          5536 kB           6584 kB
8  pg_toast_22338   5232 kB           5368 kB
...
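A query along these lines produces such a list (a sketch using the 8.1+
size functions; not necessarily the exact query used):

```sql
-- Biggest relations, heap only vs. heap + indexes + TOAST
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))       AS without_indexes,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS with_indexes
FROM pg_class c
WHERE c.relkind IN ('r', 't')   -- ordinary tables and TOAST tables
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 8;
```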
Questions:
1. The tables are relatively small and thus cannot account for a 125 MB
compressed backup file.
Why is the backup file so big?
2. How can I determine what data pg_toast_22185 contains?
Why is it so big?
Andrus.
2007/11/15, Andrus <kobruleht2@hot.ee>:
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size on disk returned by pg_database_size() is 210 MB.
The compressed backup file size is now 125 MB.
How do you produce this dump? pg_dump -Fc ?
If you create a plain dump ( pg_dump DBNAME > dump.file ) and zip it
(normal deflate, like WinZip or something similar), what's the size?
If that is much smaller than 125 MB, maybe you hit some corner case
with Postgres's built-in compression.
You cannot *always* expect a 10x ratio... it depends on the data. Maybe
the usage patterns of your database changed and now the data is less
compressible?
--
Filip Rembiałkowski
On Thu, 15 Nov 2007 20:35:35 +0200
"Andrus" <kobruleht2@hot.ee> wrote:
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size on disk returned by pg_database_size() is 210 MB.
The compressed backup file size is now 125 MB.
This seems too big. I expect compression to decrease the size roughly 10
times; also, indexes are not backed up. A year ago the compressed backup
size was only 9 MB.
I created a query returning the biggest tables with and without indexes and found:
   table            without indexes   with indexes
1  pg_toast_22185   95 MB             96 MB
2  rid              21 MB             27 MB
3  klient           13 MB             19 MB
4  mailbox          10 MB             11 MB
5  dok              7640 kB           12 MB
6  desktop          8080 kB           8200 kB
7  strings          5536 kB           6584 kB
8  pg_toast_22338   5232 kB           5368 kB
...
Questions:
1. The tables are relatively small and thus cannot account for a 125 MB
compressed backup file.
Why is the backup file so big?
I assume that "mailbox" contains e-mail messages, as the name suggests...
One can assume that those emails also have attachments like PDF, JPEG and
other mostly incompressible file formats.
In my understanding of PostgreSQL, when you have a variable-length field
(like text or bytea), pg will store only the first bytes (a few hundred or
so) of that field in the main table and then put the rest in a "hidden" [1]
TOAST table.
2. How to determine what data is containing in pg_toast_22185 ?
If you have pgAdmin at hand, or you can install it, it's quite easy: just
enable "Show System Objects" under the "View" menu [2]. Based on my
assumptions, I guess it's the TOAST table of the field which contains the
body of your emails...
Enough wild guesses,
HTH,
--
MaXX
[1]: hidden: transparent from a user's point of view
[2]: you can also obtain this information from the information schema, IIRC
Andrus wrote:
2. How can I determine what data pg_toast_22185 contains?
Why is it so big?
That will be determined by the columns you define and the data you insert.
Read http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
to get an explanation of TOAST and how you can change the settings if
you wish.
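For example, if a column holds data that is already compressed, you can
tell Postgres to skip its own compression attempt for that column. A
sketch; the column name here is hypothetical, not taken from your schema:

```sql
-- EXTERNAL = store out-of-line in the TOAST table, but uncompressed;
-- useful when the column's contents (e.g. zip/jpeg attachments) are
-- already compressed and won't shrink further.
ALTER TABLE mailbox ALTER COLUMN attachment SET STORAGE EXTERNAL;
```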
This is most likely the reason for your "lack of compression" - the data
in the toast tables is most likely already-compressed data, so it will be
close to a 1:1 export size.
--
Shane Ambler
pgSQL@Sheeky.Biz
Get Sheeky @ http://Sheeky.Biz
"Andrus" <kobruleht2@hot.ee> writes:
I created query returning biggest tables with and without indexes and found:
1 pg_toast_22185 95 MB 96 MB
1. The tables are relatively small and thus cannot account for a 125 MB
compressed backup file.
You have failed to consider that data that goes into a toast table is
usually compressed first. The above table is probably accounting for
close to 100MB of compressed dump output.
regards, tom lane
On Thu, 2007-11-15 at 20:35 +0200, Andrus wrote:
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size on disk returned by pg_database_size() is 210 MB.
The compressed backup file size is now 125 MB.
This seems too big. I expect compression to decrease the size roughly 10
times; also, indexes are not backed up. A year ago the compressed backup
size was only 9 MB.
I created a query returning the biggest tables with and without indexes and found:
   table            without indexes   with indexes
1  pg_toast_22185   95 MB             96 MB
2  rid              21 MB             27 MB
3  klient           13 MB             19 MB
4  mailbox          10 MB             11 MB
5  dok              7640 kB           12 MB
6  desktop          8080 kB           8200 kB
7  strings          5536 kB           6584 kB
8  pg_toast_22338   5232 kB           5368 kB
...
Questions:
1. The tables are relatively small and thus cannot account for a 125 MB
compressed backup file.
Why is the backup file so big?
Is this a pg_dump backup or a PITR style backup?
If it's a pg_dump backup, you can open it up in an editor to find out
what's taking so much space.
2. How can I determine what data pg_toast_22185 contains?
Why is it so big?
Could it be that you haven't been VACUUMing properly? Possibly you need
to run a VACUUM FULL if you haven't kept up. If it's a PITR-style backup
in (1), that could be the same reason.
To find what table has pg_toast_22185, try:
SELECT relname FROM pg_class WHERE oid=22185
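(That works because the toast table's name embeds the owning table's OID.
A lookup that doesn't rely on the naming convention, as a sketch:)

```sql
-- pg_class.reltoastrelid points from a table to its TOAST table
SELECT c.relname AS owning_table
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE t.relname = 'pg_toast_22185';
```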
//Magnus
Filip Rembiałkowski wrote:
2007/11/15, Andrus <kobruleht2@hot.ee>:
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size on disk returned by pg_database_size() is 210 MB.
The compressed backup file size is now 125 MB.
How do you produce this dump? pg_dump -Fc ?
If you create a plain dump ( pg_dump DBNAME > dump.file ) and zip it
(normal deflate, like WinZip or something similar), what's the size?
If that is much smaller than 125 MB, maybe you hit some corner case
with Postgres's built-in compression.
You cannot *always* expect a 10x ratio... it depends on the data. Maybe
the usage patterns of your database changed and now the data is less
compressible?
We find we get much, much better compression using bzip2.
(It would be wonderful if pg_restore supported this natively, but I don't
know enough to know if it is possible or not.)
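In the meantime you can pipe a plain-format dump through bzip2 yourself;
a sketch, with the database name as a placeholder:

```shell
# Dump without pg_dump's built-in compression, compress with bzip2
pg_dump mydb | bzip2 > mydb.sql.bz2

# Restore by decompressing back into psql
bunzip2 -c mydb.sql.bz2 | psql mydb
```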
--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/