database size estimates
Hi,
I am trying to estimate the size of a table composed of 51754000 rows.
Each row has 31 attributes: 16 x bit(24) and 15 x bit(8)
So, the payload should be:
51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB
Now, from what I understand from postgresql manual is that the overhead
is composed of
32 bytes per row + 20 bytes per page.
This leads me to approx. 1700 MB overhead.
Therefore a total table size of 4900 MB.
However, when I load the table, the table reaches the size of 21500 MB,
i.e., 400% of my estimate.
The table has no toast table, no index.
So I am wondering if someone could give me a better estimate.
Cheers,
Francois
Francois Deliege wrote:
Hi,
I am trying to estimate the size of a table composed of 51754000 rows.
Each row has 31 attributes: 16 x bit(24) and 15 x bit(8)So, the payload should be:
51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB
What data types are those exactly? If those 24-bit fields are in fact
text, varchar, char(x) or other data types that are stored as variable
length fields, the varlen header will take 4 bytes. And then there's
alignment, those 24-bit fields are most almost certainly 4-byte aligned,
which means that there'll be one byte of padding between them.
The upcoming 8.3 release will be much better in that respect, It'll use
just a 1 byte varlen header per field instead of 4 bytes for small
values like yours. You might want to test a CVS snapshot.
Now, from what I understand from postgresql manual is that the overhead
is composed of
32 bytes per row + 20 bytes per page.
This leads me to approx. 1700 MB overhead.
Therefore a total table size of 4900 MB.
In addition, there will be on average 1/2 rows worth of wasted space on
every page.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
Francois Deliege wrote:
Hi,
I am trying to estimate the size of a table composed of 51754000 rows.
Each row has 31 attributes: 16 x bit(24) and 15 x bit(8)So, the payload should be:
51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MBWhat data types are those exactly? If those 24-bit fields are in fact text,
varchar, char(x) or other data types that are stored as variable length fields,
And sadly that includes bit() if you're being literal.
As of Postgres 8.1 you can see how much space a column is taking up using the
pg_column_size() function. This won't include alignment padding but will
include the length header for that column.
You can see how much a given row is taking up by passing the entire row to
og_column_size with something like pg_column_size(tab.*)
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com