double precision[] storage space questions

Started by gregalmost 11 years ago4 messagesgeneral
Jump to latest
#1greg
gregory.jevardat@unige.ch

Hi all

I cannot find any documentation on the space taken by a double precision
array. And the few tests I did surprise me.

Here are a few tries I did to understand
select pg_column_size(1.1::double precision) return 8 --- as
expected
select pg_column_size('{}'::double precision[]) return 16 --- ok
maybe an array header
select pg_column_size('{1.111}'::double precision[]) return 32 --- I
expected 16+ sizeof(double) = 24

select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]) return 104 --- I'am lost because I expected 10*16 + 16 = 176.
It is neither 16+10*8 (96)

So what is happening behind the scene, I did not found any documentation.
Since the queries are done in memory I suppose no compression is going on.
Furthermore select
pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
precision[]) return 104 as well. So I discard compression.

The whole point is that in the application I work on, we store double arrays
as bytea (using some serialization before storing the data).
I was very surprised to see that the storage of an array of double take more
space using double precision[] than serializing it and storing it into a
bytea.

Thanks for any help and docs

--
View this message in context: http://postgresql.nabble.com/double-precision-storage-space-questions-tp5853581.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: greg (#1)
Re: double precision[] storage space questions

greg <gregory.jevardat@unige.ch> writes:

I cannot find any documentation on the space taken by a double precision
array. And the few tests I did surprise me.

Here are a few tries I did to understand
select pg_column_size(1.1::double precision) return 8 --- as
expected
select pg_column_size('{}'::double precision[]) return 16 --- ok
maybe an array header
select pg_column_size('{1.111}'::double precision[]) return 32 --- I
expected 16+ sizeof(double) = 24

'{}' is a zero-dimensional array so it doesn't have the same
dimensionality information that your third case does. See
the comments at the head of
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/utils/array.h;hb=HEAD

The whole point is that in the application I work on, we store double arrays
as bytea (using some serialization before storing the data).

TBH, that seems like a pretty silly decision. It guarantees that you
cannot do any useful manipulations of the array on the database side.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Francisco Olarte
folarte@peoplecall.com
In reply to: greg (#1)
Re: double precision[] storage space questions

Hi Greg:

On Fri, Jun 12, 2015 at 4:08 PM, greg <gregory.jevardat@unige.ch> wrote:

I cannot find any documentation on the space taken by a double precision
array. And the few tests I did surprise me.

Here are a few tries I did to understand
select pg_column_size(1.1::double precision) return 8 --- as
expected
select pg_column_size('{}'::double precision[]) return 16 --- ok
maybe an array header
select pg_column_size('{1.111}'::double precision[]) return 32 --- I
expected 16+ sizeof(double) = 24

select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]) return 104 --- I'am lost because I expected 10*16 + 16 = 176.
It is neither 16+10*8 (96)

That's not a very good set of tests, look at mine ( slightly edited for size );

apc=# select pg_column_size('{}'::double precision[]);
16
apc=# select pg_column_size('{1.0}'::double precision[]);
32
apc=# select pg_column_size('{1.0,2.0}'::double precision[]);
40
apc=# select pg_column_size('{1.0,2.0,3.0}'::double precision[]);
48
apc=# select pg_column_size('{1.0,2.0,3.0,4.0}'::double precision[]);
56

Here I already expect 8*n+24, so the data point for 10

apc=# select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]);
104

Does not surprise me.

So what is happening behind the scene, I did not found any documentation.
Since the queries are done in memory I suppose no compression is going on.
Furthermore select
pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
precision[]) return 104 as well. So I discard compression.

You're hunting for wild things, as said in some other places, headers,
special case for dimensionless arrays, it's a classic thing in
databases.

The whole point is that in the application I work on, we store double arrays
as bytea (using some serialization before storing the data).
I was very surprised to see that the storage of an array of double take more
space using double precision[] than serializing it and storing it into a
bytea.

Not too much, just 20 bytes more per column, unless you play
compression tricks. Unless you have lots of small columns, I doubt
serializing/deserializing it is worth the hassle. Postgres does not
always use the most compact form for storage. In fact I would be
greatly surprised that any database stores an array ( which can be
multidimensional, I do not know if other databases have single
dimensional array types ) in a more compact way than an specialized
serialization format for one dimensional double arrays.

Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4greg
gregory.jevardat@unige.ch
In reply to: Francisco Olarte (#3)
Re: double precision[] storage space questions

Francisco Olarte wrote

Hi Greg:

On Fri, Jun 12, 2015 at 4:08 PM, greg &lt;

gregory.jevardat@

&gt; wrote:

I cannot find any documentation on the space taken by a double precision
array. And the few tests I did surprise me.

Here are a few tries I did to understand
select pg_column_size(1.1::double precision) return 8 ---
as
expected
select pg_column_size('{}'::double precision[]) return 16 ---
ok
maybe an array header
select pg_column_size('{1.111}'::double precision[]) return 32 --- I
expected 16+ sizeof(double) = 24

select
pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]) return 104 --- I'am lost because I expected 10*16 + 16 =
176.
It is neither 16+10*8 (96)

That's not a very good set of tests, look at mine ( slightly edited for
size );

apc=# select pg_column_size('{}'::double precision[]);
16
apc=# select pg_column_size('{1.0}'::double precision[]);
32
apc=# select pg_column_size('{1.0,2.0}'::double precision[]);
40
apc=# select pg_column_size('{1.0,2.0,3.0}'::double precision[]);
48
apc=# select pg_column_size('{1.0,2.0,3.0,4.0}'::double precision[]);
56

Here I already expect 8*n+24, so the data point for 10

apc=# select
pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]);
104

Does not surprise me.

So what is happening behind the scene, I did not found any documentation.
Since the queries are done in memory I suppose no compression is going
on.
Furthermore select
pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
precision[]) return 104 as well. So I discard compression.

You're hunting for wild things, as said in some other places, headers,
special case for dimensionless arrays, it's a classic thing in
databases.

The whole point is that in the application I work on, we store double
arrays
as bytea (using some serialization before storing the data).
I was very surprised to see that the storage of an array of double take
more
space using double precision[] than serializing it and storing it into a
bytea.

Not too much, just 20 bytes more per column, unless you play
compression tricks. Unless you have lots of small columns, I doubt
serializing/deserializing it is worth the hassle. Postgres does not
always use the most compact form for storage. In fact I would be
greatly surprised that any database stores an array ( which can be
multidimensional, I do not know if other databases have single
dimensional array types ) in a more compact way than an specialized
serialization format for one dimensional double arrays.

Francisco Olarte.

--
Sent via pgsql-general mailing list (

pgsql-general@

)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Thanks a lot for this crystal clear explanation. Indeed my tests were bad, I
should have find it by myself ... So there is 24 extra bytes per column.
Given the average length of my arrays it is not sure that the byte storage
worth it.

Greg

--
View this message in context: http://postgresql.nabble.com/double-precision-storage-space-questions-tp5853581p5853626.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general