database is bigger after dump/restore - why? (60 GB to 109 GB)
Hi. Last week our 60 GB database (per psql \l+) was (re-)replicated to
the DR site using Slony-I, and arrived 109 GB in size, which caused a
problem as it filled up the filesystem on the DR server - we expected the
DR database to be the same size. Mystery.
Now just past weekend we upgraded our production server by pg_dump
and pg_restore, and again the database is 109 GB in size!
Most of our data is in a single table, which on the old server is 50 GB in
size and on the new server is 100 GB in size.
Could you please help us understand why a COPY of the data into a new
database (whether DR or the new server) results in different disk usage?
Somebody mentioned on the Slony users list that there is a kind of padding
that goes in that actually helps performance.
Is there a way to track disk usage MINUS the padding?
Thanks,
Aleksey
I've dumped the big table from the original database (where it is 61 GB in size)
and am restoring it into a test database to see what the size is after
the restore.
As it is now, our DR is offline because we did not expect the database to
nearly double in size upon COPY of the data into a new database. I would like
to understand what is going on, and to avoid such a swell of data
upon transfer. Is there anything I can do, please?
Best,
Aleksey
Hi Aleksey,
I've read your previous post, and although I'm not quite sure what the
root cause is, I have some questions and (maybe wild) guesses ...
1) Are those two machines (primary and DR) exactly the same? I mean CPU,
filesystem (including blocksize etc.)?
2) What about database encoding? I think that ASCII on the primary and UTF-8
on the DR might have resulted in such a difference in size (but maybe I'm
wrong).
3) How exactly have you measured the size of the database? Filesystem
(using 'du'), or from pg_* views?
4) The fact that a dump file is significantly smaller than the database is
not a big surprise - the reason is that some values are 'shorter' in ASCII
than in the original form (e.g. 32-bit integer 0 is encoded as a single
char '0' which means 1B instead of 4B).
5) Have you tried to use pageinspect contrib module? It allows you to find
out exactly how the data items are organized on a page, so I'd recommend
this:
a) compare table sizes using system catalogs, select the one with
greatest difference
b) use pageinspect to display details about one block of the table in
both databases and compare the result
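For step (b), a sketch of what that could look like (assuming the pageinspect contrib module is installed; 'tablename' and block number 0 are placeholders):

```sql
-- Page header of block 0 -- compare lower/upper/special and pagesize:
SELECT * FROM page_header(get_raw_page('tablename', 0));

-- Item pointers and tuple headers on that same block:
SELECT * FROM heap_page_items(get_raw_page('tablename', 0));
```

Run this against the same block number in both databases and compare the output side by side.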
Maybe this will help you to find out the real cause.
regards
Tomas
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Tue, Feb 22, 2011 at 9:44 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
Most of our data is in a single table, which on the old server is 50 GB in
size and on the new server is 100 GB in size.
Maybe the table on the new server has a fillfactor less than 100?
Greetings
Marcin
On 24/02/11 15:24, marcin mank wrote:
On Tue, Feb 22, 2011 at 9:44 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
Most of our data is in a single table, which on the old server is 50 GB in
size and on the new server is 100 GB in size.
Maybe the table on the new server has a fillfactor less than 100?
That would be my immediate guess. Someone changed the fillfactor on the
table - that won't affect the existing data but will affect a restore.
--
Richard Huxton
Archonet Ltd
Hi. Thanks for your replies.
How do I check the fillfactor on the table, please?
(http://www.postgresql.org/docs/8.4/static/sql-createtable.html tells me how
to set it, but I haven't found yet how to read it.)
Same CPU, same filesystem, same blocksize - identical systems. Same model
of server. We made them identical on purpose.
The way I check table size including TOAST and Indexes is:
SELECT relname as "Table",
pg_size_pretty(pg_total_relation_size(relid)) As "Size" from
pg_catalog.pg_statio_user_tables ORDER BY
pg_total_relation_size(relid) DESC;
My largest table is 50 GB in size; when I pg_dump it, and then
pg_restore it, it becomes 100 GB in size.
How do I pg_restore it so that it is 50 GB in size? Is it a setting
to pg_dump or to pg_restore?
Aleksey
On 02/24/11 11:02 AM, Aleksey Tsalolikhin wrote:
How do I check the fillfactor on the table, please?
it's in the reloptions field in pg_class, so...
select reloptions from pg_class where relname='tablename';
if tablename is non-unique, you'll need to qualify that with the OID of
the namespace (aka schema).
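To spell out the schema-qualified variant (a sketch; 'public' and 'tablename' are placeholders):

```sql
SELECT c.reloptions
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'tablename'
  AND n.nspname = 'public';
```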
My largest table is 50 GB in size; when I pg_dump it, and then
pg_restore it, it becomes 100 GB in size.
How do I pg_restore it so that it is 50 GB in size? Is it a setting
to pg_dump or to pg_restore?
are you truncating the table before restoring, or is this a restore into
a new database, or what?
On Thu, Feb 24, 2011 at 11:46 AM, John R Pierce <pierce@hogranch.com> wrote:
On 02/24/11 11:02 AM, Aleksey Tsalolikhin wrote:
How do I check the fillfactor on the table, please?
its in the field reloptions in pg_class. so...
select reloptions from pg_class where relname='tablename';
Thanks, John!
autovacuum_enabled=true is the only option set on this table
on both the source database and the target.
How do I pg_restore it so that it is 50 GB in size? Is it a setting
to pg_dump or to pg_restore?
are you truncating the table before restoring, or is this a restore into a
new database, or what?
I've tried both. Slony truncates the table before copying it over, and I've
tried pg_restore'ing it into a new database. In both cases, the 50 GB
table arrives as a 100 GB table.
Aleksey
On Thu, Feb 24, 2011 at 14:11, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
are you truncating the table before restoring, or is this a restore into a
new database, or what?
I've tried both. Slony truncates the table before copying it over, and I've
tried pg_restore'ing it into a new database. In both cases, the 50 GB
table arrives as a 100 GB table.
Are they both the same version of pg? I'm wondering if the 50GB
version is running 8.3 and the slave something >=8.4, IIRC I had a
table that grew quite a bit because of some TOAST changes in 8.4. If
so, I'd first try making the slave and master versions of pg match,
which is good practice anyhow.
If none of the above rings any bells, maybe some bloat is creeping in.
Check_postgres (http://bucardo.org/wiki/Check_postgres) has some nice
metrics for that. (I suppose you could also try CLUSTER and see if
that makes a size difference.)
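The CLUSTER experiment could be sketched like this (table and index names are placeholders; note that CLUSTER takes an exclusive lock and rewrites the whole table):

```sql
-- Rewrite the table in index order (which also repacks it), then re-check the size:
CLUSTER tablename USING tablename_pkey;
SELECT pg_size_pretty(pg_total_relation_size('tablename'));
```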
On Thursday, February 24, 2011 1:11:44 pm Aleksey Tsalolikhin wrote:
On Thu, Feb 24, 2011 at 11:46 AM, John R Pierce <pierce@hogranch.com> wrote:
On 02/24/11 11:02 AM, Aleksey Tsalolikhin wrote:
How do I check the fillfactor on the table, please?
its in the field reloptions in pg_class. so...
select reloptions from pg_class where relname='tablename';
Thanks, John!
autovacuum_enabled=true is the only option set on this table
on both the source database and the target.
How do I pg_restore it so that it is 50 GB in size? Is it a setting
to pg_dump or to pg_restore?
are you truncating the table before restoring, or is this a restore into
a new database, or what?
I've tried both. Slony truncates the table before copying it over, and
I've tried pg_restore'ing it into a new database. In both cases, the 50
GB table arrives as a 100 GB table.
Aleksey
What is the schema for this table and associated indexes?
Or if that is not available what is in the table?
Is the data the same in both for the problem table? I am somewhat suspicious
that the second db table is twice the size of the first. Almost like the data is
being duplicated.
From the above, what is the size of the pg_dump for that table only when you
dump from the original table versus from the table in the second db?
--
Adrian Klaver
adrian.klaver@gmail.com
Hi. We're running Postgres 8.4.4 everywhere.
I already have a pg_dump -Fc of the big table from the source, now
I am running a pg_dump -Fc on the recipient, to see if the size is different.
Then I will run a pg_dump as text, so I can diff the two files if they are
different in size.
Thanks!!
Aleksey
On Thursday, February 24, 2011 3:34:02 pm Aleksey Tsalolikhin wrote:
Hi. We're running Postgres 8.4.4 everywhere.
I already have a pg_dump -Fc of the big table from the source, now
I am running a pg_dump -Fc on the recipient, to see if the size is
different.
I thought you already had a pg_dump file that you were restoring to the second
db?
Then I will run a pg_dump as text, so I can diff the two files if they are
different in size.
You don't need to do that if the pg_dump was done using -Fc. You can use
pg_restore to dump a table to a file instead of a database. When it does that
the file will contain a plain text copy. Something like:
pg_restore -a -t really_big_table -f really_big_table_data.sql
Where -a is data only
Thanks!!
Aleksey
--
Adrian Klaver
adrian.klaver@gmail.com
On Thursday, February 24, 2011 3:48:35 pm Adrian Klaver wrote:
On Thursday, February 24, 2011 3:34:02 pm Aleksey Tsalolikhin wrote:
Hi. We're running Postgres 8.4.4 everywhere.
I already have a pg_dump -Fc of the big table from the source, now
I am running a pg_dump -Fc on the recipient, to see if the size is
different.
I thought you already had a pg_dump file that you were restoring to the
second db?
Then I will run a pg_dump as text, so I can diff the two files if they
are different in size.
You don't need to do that if the pg_dump was done using -Fc. You can use
pg_restore to dump a table to a file instead of a database. When it does
Oops typo ^^^^ should be restore
that the file will contain a plain text copy. Something like:
pg_restore -a -t really_big_table -f really_big_table_data.sql
Where -a is data only
Thanks!!
Aleksey
--
Adrian Klaver
adrian.klaver@gmail.com
Aleksey Tsalolikhin <atsaloli.tech@gmail.com> writes:
Now just past weekend we upgraded our production server by pg_dump
and pg_restore, and again the database is 109 GB in size!
Which character encoding of the source and target database?
--
Sergey Burladyan
OK, just to recap:
database A has a table that is 50 GB in size (according to:
SELECT relname as "Table",
pg_size_pretty(pg_total_relation_size(relid)) As "Size" from
pg_catalog.pg_statio_user_tables ORDER BY
pg_total_relation_size(relid) DESC;
)
I pg_dump -Fc this table, which gives me a 9.8 GB file.
I then pg_restore this table into database B, which results in a
100 GB table, according to the same SQL query.
Database versions are identical:
A: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
B: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
Character encoding of the source and target databases are identical:
UTF8. (As reported by "psql -l".)
fillfactor is not set for this table, it is the default on both A and
B. (As reported by "select reloptions from pg_class where
relname='tablename';".)
If I pg_dump -Fc the 100 GB table on database B, I get a 9.9 GB file.
If I do a "select count(*) from tablename", I get identical results
on A and B (1,628,348 rows).
Adrian asked about the schema for this table. It is the same on A and B:
Table "tablename"
Column | Type | Modifiers
-----------------+--------------------------+-----------
column1 | character varying | not null
column2 | character varying |
column3 | character varying |
column4 | character varying |
column5 | timestamp with time zone |
column6 | character varying |
column7 | character varying |
column8 | character varying |
column9 | character varying |
column10 | character varying |
Indexes:
"tablename_pkey" PRIMARY KEY, btree (column1)
"tablename_column6_index" btree (column6)
"tablename_column9_index" btree (column9)
"tablename_column8_index" btree (column8)
"tablename_column7_index" btree (column7)
When I pg_dump the 50 GB table, I get a 40 GB file.
When I pg_dump the 100 GB table, I get a 40 GB file.
so looks like it's something low-level, something about how the data is stored.
i've installed the "pageinspect" contrib module as Tomas suggested but
I don't know what to do with it or what to look at. I looked at the
manual for it but it's totally over my head right now.
What sections of the manual should I read to be able to use this
module? (there are 2167 pages in the whole Postgres 8.4 manual and I
don't have time to study the whole thing. :( I just need to study
enough to understand how to use pageinspect.)
(I'm not a Postgres DBA but I am having to become one to support our database.)
Thanks again for the help. I'll be at SCALE 9x tomorrow helping Joe
Conway and Bruce M and Richard B and company get the word out about
Postgres.
Best,
Aleksey
On 27 Feb 2011, at 9:49, Aleksey Tsalolikhin wrote:
Database versions are identical:
A: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
B: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
Character encoding of the source and target databases are identical:
UTF8. (As reported by "psql -l".)
I noticed in your table definition that you seem to store timestamps in text-fields. Restoring those from text-fields shouldn't make any difference, but perhaps your locales are set up differently between the machines and cause some type of conversion to take place?
I know, that's a pretty wild guess.
When I pg_dump the 50 GB table, I get a 40 GB file.
When I pg_dump the 100 GB table, I get a 40 GB file.
I think the possible causes of the problem being with the database have been rather exhausted by now. Maybe the difference is in how the OS was set up on each system. So, more questions:
What type of file-system are you using on each database (for the table in question)?
Are these filesystems configured identically, or does one perhaps have a different block-size than the other?
Is it set up as a raid array? If so, which raid-level?
Are your dumps going to that same file-system, or to a different one?
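One way to compare block sizes on each server (a sketch; /dev/sda1 is a placeholder for the device backing the Postgres data directory, and dumpe2fs applies to ext2/3/4 filesystems only):

```shell
# On each server, read the block size from the filesystem superblock (needs root):
#   dumpe2fs -h /dev/sda1 2>/dev/null | grep 'Block size'
# The line of interest looks like the sample below; extracting just the number:
sample="Block size:               4096"
echo "$sample" | awk -F: '{gsub(/[[:space:]]/, "", $2); print $2}'
```

If the number differs between the two machines, that alone could account for some size difference.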
Alban Hertroys
--
Screwing up is an excellent way to attach something to the ceiling.
Aleksey Tsalolikhin <atsaloli.tech@gmail.com> writes:
i've installed the "pageinspect" contrib module as Tomas suggested but
I don't know what to do with it or what to look at. I looked at the
manual for it but it's totally over my head right now.
Personally I'd try pgstattuple first:
http://www.postgresql.org/docs/8.4/static/pgstattuple.html
The stats from that for the table and each index should at least let us
narrow down where the bloat is.
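Concretely, that might look like the following (a sketch; on 8.4 the module is installed per-database from the contrib SQL script, and the table/index names are placeholders):

```sql
-- Heap statistics: live vs dead tuples and free space:
SELECT * FROM pgstattuple('tablename');

-- B-tree statistics for each index (leaf density, fragmentation):
SELECT * FROM pgstatindex('tablename_pkey');
```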
regards, tom lane
On Sunday, February 27, 2011 12:49:48 am Aleksey Tsalolikhin wrote:
so looks like it's something low-level, something about how the data is
stored.
i've installed the "pageinspect" contrib module as Tomas suggested but
I don't know what to do with it or what to look at. I looked at the
manual for it but it's totally over my head right now.
What sections of the manual should I read to be able to use this
module? (there are 2167 pages in the whole Postgres 8.4 manual and I
don't have time to study the whole thing. :( I just need to study
enough to understand how to use pageinspect.)
(I'm not a Postgres DBA but I am having to become one to support our
database.)
Thanks again for the help. I'll be at SCALE 9x tomorrow helping Joe
Conway and Bruce M and Richard B and company get the word out about
Postgres.
Well it looks like the problem is not with the raw data. So far you have been
using pg_total_relation_size to look at the table. It might be time to look at
the individual components by using the other pg_* functions:
http://www.postgresql.org/docs/9.0/interactive/functions-admin.html
pg_table_size(regclass)
pg_indexes_size(regclass)
pg_relation_size(relation regclass, fork text)
In other words try to narrow down where the bloat is occurring.
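On 8.4, where pg_table_size() and pg_indexes_size() do not exist yet, a rough breakdown is still possible with pg_relation_size() alone (a sketch; 'tablename' is a placeholder):

```sql
-- Heap only (no TOAST, no indexes):
SELECT pg_size_pretty(pg_relation_size('tablename'));

-- Everything minus the heap, i.e. TOAST plus indexes:
SELECT pg_size_pretty(pg_total_relation_size('tablename')
                      - pg_relation_size('tablename'));

-- A single index:
SELECT pg_size_pretty(pg_relation_size('tablename_pkey'));
```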
--
Adrian Klaver
adrian.klaver@gmail.com
On Sun, Feb 27, 2011 at 2:52 AM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
Thank you for your kind replies.
I noticed in your table definition that you seem to store timestamps in text-fields. Restoring those from text-fields shouldn't make any difference, but perhaps your locales are set up differently between the machines and cause some type of conversion to take place?
OK, Alban, I'm game. How would I check how locales are set up?
Adrian, I found pg_indexes_size() is only in 9 (I have 8.4) but I got
the same information from a query based on
http://www.issociate.de/board/post/478501/How_much_space_do_database_objects_take_up_in_data_files.html
I used:
SELECT nspname, relname,
pg_size_pretty(tablesize) AS tablesize, pg_size_pretty(indexsize) AS
indexsize, pg_size_pretty(toastsize) AS toastsize,
pg_size_pretty(toastindexsize) AS toastindexsize
FROM
(SELECT ns.nspname, cl.relname, pg_relation_size(cl.oid) AS tablesize,
COALESCE((SELECT SUM(pg_relation_size(indexrelid))::bigint
FROM pg_index WHERE cl.oid=indrelid), 0) AS indexsize,
CASE WHEN reltoastrelid=0 THEN 0
ELSE pg_relation_size(reltoastrelid)
END AS toastsize,
CASE WHEN reltoastrelid=0 THEN 0
ELSE pg_relation_size((SELECT reltoastidxid FROM pg_class ct
WHERE ct.oid = cl.reltoastrelid))
END AS toastindexsize
FROM pg_class cl, pg_namespace ns
WHERE cl.relnamespace = ns.oid
AND ns.nspname NOT IN ('pg_catalog', 'information_schema')
AND cl.relname IN
(SELECT table_name FROM information_schema.tables
WHERE table_type = 'BASE TABLE')) ss
ORDER BY tablesize+indexsize+toastsize+toastindexsize DESC;
Here is what I see:
 nspname | relname | tablesize | indexsize | toastsize | toastindexsize
---------+---------+-----------+-----------+-----------+----------------
 public  | big     | 744 MB    | 737 MB    | 48 GB     | 278 MB
 public  | big     | 503 MB    | 387 MB    | 99 GB     | 278 MB
Check out that toastsize delta. What makes up TOAST? How can I
compare the two TOAST tables in detail?
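One way to find and size the TOAST table directly (a sketch; 'tablename' is a placeholder):

```sql
-- The TOAST relation backing the table, and its size:
SELECT reltoastrelid::regclass AS toast_table,
       pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
FROM pg_class
WHERE relname = 'tablename';
```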
Tom suggested pgstattuple:
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 779689984 | 1628348     | 500584290 | 64.2          | 30111            | 8275133        | 1.06               | 243295444  | 31.2          <-- database A (source, 50 GB)
 527835136 | 1628348     | 500584290 | 94.84         | 0                | 0              | 0                  | 9492072    | 1.8           <-- database B (target, 100 GB)
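As a quick arithmetic sanity check on that output (my own calculation, not from the thread): tuple_percent and free_percent are just tuple_len and free_space as fractions of table_len, and the reported values are internally consistent:

```shell
# Recompute pgstattuple's percentages from the raw byte counts above:
awk 'BEGIN {
  printf "A %.2f %.2f\n", 100*500584290/779689984, 100*243295444/779689984;
  printf "B %.2f %.2f\n", 100*500584290/527835136, 100*9492072/527835136;
}'
# prints: A 64.20 31.20, then B 94.84 1.80 -- matching pgstattuple's figures.
```

Note the live tuple data (tuple_len) is byte-for-byte identical in A and B; the main table differs only in free space, so the real bloat must be elsewhere (the TOAST table).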
I used "dumpe2fs" to check the filesystems - block size is 4096 on both servers.
One filesystem is on a hardware raid device, and one is on a software
raid device.
Thanks,
Aleksey
On Monday, February 28, 2011 9:51:10 pm Aleksey Tsalolikhin wrote:
On Sun, Feb 27, 2011 at 2:52 AM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
Thank you for your kind replies.
I noticed in your table definition that you seem to store timestamps in
text-fields. Restoring those from text-fields shouldn't make any
difference, but perhaps your locales are set up differently between the
machines and cause some type of conversion to take place?
OK, Alban, I'm game. How would I check how locales are set up?
Adrian, I found pg_indexes_size() is only in 9 (I have 8.4) but I got
the same information from a query based on
http://www.issociate.de/board/post/478501/How_much_space_do_database_objects_take_up_in_data_files.html
Sorry about that, I was not paying attention. FYI 8.4 does have
pg_relation_size() which can be applied against individual indexes.
Here is what I see:
 nspname | relname | tablesize | indexsize | toastsize | toastindexsize
---------+---------+-----------+-----------+-----------+----------------
 public  | big     | 744 MB    | 737 MB    | 48 GB     | 278 MB
 public  | big     | 503 MB    | 387 MB    | 99 GB     | 278 MB
Check out that toastsize delta. What makes up TOAST? How can I
compare the two TOAST tables in detail?
TOAST is best explained here:
http://www.postgresql.org/docs/8.4/interactive/storage-toast.html
Looks like the TOAST compression is not working on the second machine. Not sure
how that could come to be. Further investigation underway:)
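One concrete thing worth comparing (my suggestion, not something established in the thread): the per-column TOAST storage strategy, since 'e' (external) stores values out of line without compression while 'x' (extended) allows compression:

```sql
-- attstorage: 'p' plain, 'm' main, 'x' extended (compressible),
-- 'e' external (out-of-line, never compressed). Run on both A and B:
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'tablename'::regclass
  AND attnum > 0
  AND NOT attisdropped;
```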
--
Adrian Klaver
adrian.klaver@gmail.com