double precision[] storage space questions
Hi all,
I cannot find any documentation on the space taken by a double precision
array, and the few tests I ran surprised me. Here are a few queries I
tried in order to understand it:
select pg_column_size(1.1::double precision) returns 8 --- as
expected
select pg_column_size('{}'::double precision[]) returns 16 --- ok,
maybe an array header
select pg_column_size('{1.111}'::double precision[]) returns 32 --- I
expected 16 + sizeof(double) = 24
select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]) returns 104 --- I'm lost, because I expected 10*16 + 16 = 176,
and it is not 16 + 10*8 (96) either.
So what is happening behind the scenes? I could not find any documentation.
Since the queries are done in memory, I suppose no compression is going on.
Furthermore,
select pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
precision[]) returns 104 as well, so I rule out compression.
The whole point is that in the application I work on, we store double arrays
as bytea (using some serialization before storing the data).
I was very surprised to see that storing an array of doubles takes more
space as a double precision[] than serializing it and storing it as a
bytea.
Thanks for any help or pointers to documentation.
--
View this message in context: http://postgresql.nabble.com/double-precision-storage-space-questions-tp5853581.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
greg <gregory.jevardat@unige.ch> writes:
> I cannot find any documentation on the space taken by a double precision
> array. And the few tests I did surprise me.
> Here are a few tries I did to understand
> select pg_column_size(1.1::double precision) returns 8 --- as
> expected
> select pg_column_size('{}'::double precision[]) returns 16 --- ok,
> maybe an array header
> select pg_column_size('{1.111}'::double precision[]) returns 32 --- I
> expected 16 + sizeof(double) = 24
'{}' is a zero-dimensional array so it doesn't have the same
dimensionality information that your third case does. See
the comments at the head of
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/utils/array.h;hb=HEAD
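The arithmetic in that header file can be sketched in a few lines (a back-of-the-envelope model, not an authoritative implementation; it assumes a 1-D float8 array with no NULL bitmap, where the 16 + 8 = 24 bytes of header happen to need no extra alignment padding before the 8-byte elements):

```python
def expected_array_size(n_elems: int) -> int:
    """Rough model of pg_column_size() for a 1-D float8 array with no NULLs.

    Fixed header (see src/include/utils/array.h):
      vl_len_ (4) + ndim (4) + dataoffset (4) + elemtype (4) = 16 bytes,
    plus, per dimension, dims (4) + lbound (4) = 8 bytes.
    '{}' is zero-dimensional, so it is just the 16-byte fixed header.
    """
    if n_elems == 0:
        return 16                    # zero-dimensional: fixed header only
    header = 16 + 8                  # fixed header + one dimension's dims/lbound
    return header + 8 * n_elems      # plus 8 bytes per float8 element

# Reproduces the numbers seen in the thread:
print(expected_array_size(0))   # 16  for '{}'
print(expected_array_size(1))   # 32  for '{1.111}'
print(expected_array_size(10))  # 104 for the ten-element array
```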
> The whole point is that in the application I work on, we store double arrays
> as bytea (using some serialization before storing the data).
TBH, that seems like a pretty silly decision. It guarantees that you
cannot do any useful manipulations of the array on the database side.
regards, tom lane
Hi Greg:
On Fri, Jun 12, 2015 at 4:08 PM, greg <gregory.jevardat@unige.ch> wrote:
> I cannot find any documentation on the space taken by a double precision
> array. And the few tests I did surprise me. Here are a few tries I did
> to understand
> select pg_column_size(1.1::double precision) returns 8 --- as
> expected
> select pg_column_size('{}'::double precision[]) returns 16 --- ok,
> maybe an array header
> select pg_column_size('{1.111}'::double precision[]) returns 32 --- I
> expected 16 + sizeof(double) = 24
> select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
> precision[]) returns 104 --- I'm lost because I expected 10*16 + 16 = 176.
> It is not 16 + 10*8 (96) either.
That's not a very good set of tests; look at mine (slightly edited for size):
apc=# select pg_column_size('{}'::double precision[]);
16
apc=# select pg_column_size('{1.0}'::double precision[]);
32
apc=# select pg_column_size('{1.0,2.0}'::double precision[]);
40
apc=# select pg_column_size('{1.0,2.0,3.0}'::double precision[]);
48
apc=# select pg_column_size('{1.0,2.0,3.0,4.0}'::double precision[]);
56
Here I already expect 8*n + 24, so the data point for 10 elements:
apc=# select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]);
104
Does not surprise me.
> So what is happening behind the scenes? I did not find any documentation.
> Since the queries are done in memory I suppose no compression is going on.
> Furthermore select
> pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
> precision[]) returns 104 as well. So I rule out compression.
You're hunting for wild explanations. As noted elsewhere in the thread, it's
headers plus a special case for dimensionless arrays: a classic thing in
databases.
> The whole point is that in the application I work on, we store double arrays
> as bytea (using some serialization before storing the data).
> I was very surprised to see that storing an array of doubles takes more
> space using double precision[] than serializing it and storing it into a
> bytea.
Not too much, just 20 bytes more per column, unless you play
compression tricks. Unless you have lots of small columns, I doubt
serializing/deserializing is worth the hassle. Postgres does not
always use the most compact form for storage. In fact, I would be
greatly surprised if any database stored an array (which can be
multidimensional; I do not know whether other databases have
single-dimensional array types) more compactly than a specialized
serialization format for one-dimensional double arrays.
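That 20-byte figure can be sanity-checked with a quick sketch (assumptions: the bytea pays a 4-byte varlena length word, the array pays the 8*n + 24 cost measured above; on-disk short varlena headers and TOAST are ignored):

```python
def float8_array_size(n: int) -> int:
    # 24-byte array header (16 fixed + dims/lbound for one dimension)
    # plus 8 bytes per float8 element
    return 24 + 8 * n

def bytea_size(n: int) -> int:
    # 4-byte varlena length word plus 8 bytes per serialized double
    return 4 + 8 * n

for n in (1, 10, 1000):
    # The overhead is a constant 20 bytes regardless of array length
    print(n, float8_array_size(n) - bytea_size(n))
```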
Francisco Olarte.
Thanks a lot for this crystal clear explanation. Indeed my tests were bad; I
should have found it by myself. So there are 24 extra bytes of header per
array. Given the average length of our arrays, it is not clear that the
bytea storage is worth it.
Greg