size of bytea + performance issues
Hi everyone,
my first question concerns the the size of a table with a bytea row. In
the documentation it says something like 4 Bytes + 1 Byte for each
escaped octet sequence per row. For example an insertion into a table
storing just one column with bytea data looks like this:
insert into test values (''\\003''); ---- this allows 185 inserts per
page until a new one is needed
insert into test values (''\\003\\123\\123\\111''); ---- this yields
exactly the same nr. of rows per page!
insert into test values (''\\003\\123\\123\\111\\001''); ---- this one
finally needs more pages!
How can that be, if the system internally allocates ONE BYTE per
octet...
Does it acually take 4 Bytes?
My second question is more generall:
My dbms (acutally it is supposed to be a decision support system, so I
do not really need rollbacks and transactions etc. --- can those
features be turned off to enhance performance?) needs to store 1
Billion rows in a single table (I know that I could use horizontal
partitioning) and I wonder if postgres is powerfull enough to handle
such large tables? If anyone has experience with tables this size and
could give me a hint which system to use (DB2, Sybase, Oracle,
Informix, Postgres) this would be great...
THX
tschak
On Tue, Jan 31, 2006 at 07:58:30AM -0800, tschak wrote:
Hi everyone,
my first question concerns the the size of a table with a bytea row. In
the documentation it says something like 4 Bytes + 1 Byte for each
escaped octet sequence per row. For example an insertion into a table
storing just one column with bytea data looks like this:
insert into test values (''\\003''); ---- this allows 185 inserts per
page until a new one is needed
insert into test values (''\\003\\123\\123\\111''); ---- this yields
exactly the same nr. of rows per page!
insert into test values (''\\003\\123\\123\\111\\001''); ---- this one
finally needs more pages!How can that be, if the system internally allocates ONE BYTE per
octet...
Does it acually take 4 Bytes?
Alignment. A new row needs to start on a multiple of 4 (or 8) boundary.
So even though your data value might take 5 bytes, you may end up with
some slack before the next tuple.
My second question is more generall:
My dbms (acutally it is supposed to be a decision support system, so I
do not really need rollbacks and transactions etc. --- can those
features be turned off to enhance performance?) needs to store 1
Well, you might need to take this up on the -performance list, but
there are people running databases that large. In general such features
can't be turned off, though they don't really cost much on data that
doesn't change.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Allright... do you mean an alignment with an offset of 4/8 bit or byte?
If it is just bit I cannot really follow the calculation...
Nevertheless it sounds like an explanation for this "effect".
I have one more question concerning size. The following table
create table test(val0 unsigned smallint,
... ... ... .. ..., val7 unsigned smallint,
bulk bytea);
needs app. 100 Bytes per row even though the user data only uses
8*2Bytes (vali) + 40Bytes (the sizye of my bytea). Does that mean, that
the system needs 46 bytes for internal represantationsor is the
alignment factor a again a source of this "blowup"?
Thanks for your help,
Tschak
On Tue, Jan 31, 2006 at 09:15:18AM -0800, tschak wrote:
Allright... do you mean an alignment with an offset of 4/8 bit or byte?
If it is just bit I cannot really follow the calculation...
Nevertheless it sounds like an explanation for this "effect".I have one more question concerning size. The following table
create table test(val0 unsigned smallint,
... ... ... .. ..., val7 unsigned smallint,
bulk bytea);needs app. 100 Bytes per row even though the user data only uses
8*2Bytes (vali) + 40Bytes (the sizye of my bytea). Does that mean, that
the system needs 46 bytes for internal represantationsor is the
alignment factor a again a source of this "blowup"?
Well, it's in the FAQ under "How much database disk space is required
to store data from a typical text file?" but the per tuple overhead is
between 36 and 44 bytes. depends a bit on the version.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Well, it's in the FAQ under "How much database disk space is required
to store data from a typical text file?" but the per tuple overhead is
between 36 and 44 bytes. depends a bit on the version.
allright... thx a lot!
I did not see this point in the FAQ because I am not storing data from
a textfile and did not look in that direction... What store is
basically a encoding scheme of several hash bins which store
distances. Thus it is just a bitvector and to the best of my knowledge
a bytea or varyiing bitarray is the most space efficient method to
store my information(???).
Thanks again,
Tschak
On Tue, Jan 31, 2006 at 08:15:44PM +0100, Jochen Schlosser wrote:
Well, it's in the FAQ under "How much database disk space is required
to store data from a typical text file?" but the per tuple overhead is
between 36 and 44 bytes. depends a bit on the version.allright... thx a lot!
I did not see this point in the FAQ because I am not storing data from
a textfile and did not look in that direction... What store is
basically a encoding scheme of several hash bins which store
distances. Thus it is just a bitvector and to the best of my knowledge
a bytea or varyiing bitarray is the most space efficient method to
store my information(???).
It is if it's truely variable in length. If you know it's limited to say
4 bytes, you'd probably be better off with an int4, which doesn't have
the varlena overhead (4 bytes) that a bytea does.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461