Array types

Started by John Listeralmost 17 years ago23 messages
#1John Lister
john.lister-ps@kickstone.com

Hi, using v8.3.5 and a number of client libraries (java, python, pgadmin) and playing about with arrays.

They all return arrays as text, is it possible to configure postgresql to return an array in native form (does postgresql support such a thing)? This is using both the simple and extended query forms - i couldn't see a way to say what return type i wanted in the protocol docs...

This would seem much better in terms of performance, both size and speed(conversion).

Thanks

--

Got needs? Get Goblin'! - http://www.pricegoblin.co.uk/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Lister (#1)
Re: Array types

"John Lister" <john.lister-ps@kickstone.com> writes:

Hi, using v8.3.5 and a number of client libraries (java, python, pgadmin) and playing about with arrays.
They all return arrays as text, is it possible to configure postgresql
to return an array in native form (does postgresql support such a
thing)?

This is something the client code would request (or not). It would not
be sensible to try to force it from the server side, since if the client
doesn't request it it's likely that the client wouldn't understand the
data format.

regards, tom lane

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#2)
Re: Array types

On Tue, Apr 7, 2009 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"John Lister" <john.lister-ps@kickstone.com> writes:

Hi, using v8.3.5 and a number of client libraries (java, python, pgadmin) and playing about with arrays.
They all return arrays as text, is it possible to configure postgresql
to return an array in native form (does postgresql support such a
thing)?

This is something the client code would request (or not).  It would not
be sensible to try to force it from the server side, since if the client
doesn't request it it's likely that the client wouldn't understand the
data format.

unless, of course, you are using libpqtypes :D

http://libpqtypes.esilo.com/

merlin

#4Andrew Chernow
ac@esilo.com
In reply to: John Lister (#1)
Re: Array types

John Lister wrote:

They all return arrays as text, is it possible to configure postgresql
to return an array in native form (does postgresql support such a
thing)? This is using both the simple and extended query forms - i
couldn't see a way to say what return type i wanted in the protocol docs...

You need libpqtypes (client-side library). It requires a new 8.4
feature called libpq-events, but there is an 8.3 patch available. If
you willing to patch your 8.3.5 libpq client, than this should meet your
needs.

Downloads: http://pgfoundry.org/projects/libpqtypes/
Documentation: http://libpqtypes.esilo.com/

For arrays, libpqtypes gives you a PGresult where each tuple is an array
item. For composite arrays, each composite field is a PGresult field.

This would seem much better in terms of performance, both size and
speed(conversion).

That is true. Our testing has proven this. It also reduces the overall
coding effort.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#5John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

This is something the client code would request (or not). It would not
be sensible to try to force it from the server side, since if the client
doesn't request it it's likely that the client wouldn't understand the
data format.

Cheers for the quick reply, any chance of a pointer to the protocol where
the client specifies the return type(s) so i can check the client code
(mainly jdbc) to see what is going on?

Thanks

JOHN

#6John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

Does libpqtypes pass the array "over the wire" as an array? Ideally i'd like
to do this with jdbc, but might give me a pointer...

Thanks
----- Original Message -----
From: "Andrew Chernow" <ac@esilo.com>
To: "John Lister" <john.lister-ps@kickstone.com>
Cc: <pgsql-hackers@postgresql.org>
Sent: Tuesday, April 07, 2009 8:15 PM
Subject: Re: [HACKERS] Array types

Show quoted text

John Lister wrote:

They all return arrays as text, is it possible to configure postgresql to
return an array in native form (does postgresql support such a thing)?
This is using both the simple and extended query forms - i couldn't see a
way to say what return type i wanted in the protocol docs...

You need libpqtypes (client-side library). It requires a new 8.4 feature
called libpq-events, but there is an 8.3 patch available. If you willing
to patch your 8.3.5 libpq client, than this should meet your needs.

Downloads: http://pgfoundry.org/projects/libpqtypes/
Documentation: http://libpqtypes.esilo.com/

For arrays, libpqtypes gives you a PGresult where each tuple is an array
item. For composite arrays, each composite field is a PGresult field.

This would seem much better in terms of performance, both size and
speed(conversion).

That is true. Our testing has proven this. It also reduces the overall
coding effort.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#7Merlin Moncure
mmoncure@gmail.com
In reply to: John Lister (#6)
Re: Array types

On Tue, Apr 7, 2009 at 3:35 PM, John Lister
<john.lister-ps@kickstone.com> wrote:

Does libpqtypes pass the array "over the wire" as an array? Ideally i'd like
to do this with jdbc, but might give me a pointer...

We send/receive the server's array format. This is not quite a C
array, and is definitely not a java array. It's a packed postgres
specific (network byte order) format. It is much faster and tighter
than text in some cases however. We present an interface to copy C
arrays to postgres style formats such as:

PGarray a;
PQgetf(res, tuple_num, "%int4[]", field_num, &a);

This will 'pop' a result out of your result that presents the array
internals (which you could then loop).

merlin

#8John Lister
john.lister@kickstone.com
In reply to: Merlin Moncure (#7)
Re: Array types

Cheers, nice to know it is possible... Now to see if i can get
java/python to do the same :) or to use a modified libpq somehow...

Merlin Moncure wrote:

Show quoted text

On Tue, Apr 7, 2009 at 3:35 PM, John Lister
<john.lister-ps@kickstone.com> wrote:

Does libpqtypes pass the array "over the wire" as an array? Ideally i'd like
to do this with jdbc, but might give me a pointer...

We send/receive the server's array format. This is not quite a C
array, and is definitely not a java array. It's a packed postgres
specific (network byte order) format. It is much faster and tighter
than text in some cases however. We present an interface to copy C
arrays to postgres style formats such as:

PGarray a;
PQgetf(res, tuple_num, "%int4[]", field_num, &a);

This will 'pop' a result out of your result that presents the array
internals (which you could then loop).

merlin

#9Andrew Chernow
ac@esilo.com
In reply to: John Lister (#8)
Re: Array types

John Lister wrote:

Cheers, nice to know it is possible... Now to see if i can get
java/python to do the same :) or to use a modified libpq somehow...

If performance is your concern, you would probably get the best results
using the languages C glue interfrace. For instance, in java I think
you would want to create a JNI wrapper to libpqtypes that converts
PGarray to a java array.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#10James Pye
lists@jwp.name
In reply to: John Lister (#8)
Re: Array types

On Apr 7, 2009, at 12:54 PM, John Lister wrote:

Cheers, nice to know it is possible... Now to see if i can get java/
python to do the same :) or to use a modified libpq somehow...

http://python.projects.postgresql.org will do it for Python. =D

tho, only supports Python 3, which is still quite new.

#11Merlin Moncure
mmoncure@gmail.com
In reply to: James Pye (#10)
Re: Array types

On Tue, Apr 7, 2009 at 6:42 PM, James Pye <lists@jwp.name> wrote:

On Apr 7, 2009, at 12:54 PM, John Lister wrote:

Cheers, nice to know it is possible... Now to see if i can get java/python
to do the same :) or to use a modified libpq somehow...

http://python.projects.postgresql.org will do it for Python. =D

tho, only supports Python 3, which is still quite new.

I took a quick look at the pg python driver and was very impressed.
They implemented a full top to bottom binary driver with type i/o
functions in the vein of libpqtypes. If you are writing python, this
is definitely the way to go. Kudos to them for seeing the light and
doing it that way...it's much better coupling with the server than
marshaling everything through text.

merlin

#12John Lister
john.lister-ps@kickstone.com
In reply to: James Pye (#10)
Re: Array types

brilliant i'll give it a go... Now to sort out java :)

James Pye wrote:

Show quoted text

On Apr 7, 2009, at 12:54 PM, John Lister wrote:

Cheers, nice to know it is possible... Now to see if i can get
java/python to do the same :) or to use a modified libpq somehow...

http://python.projects.postgresql.org will do it for Python. =D

tho, only supports Python 3, which is still quite new.

#13John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

Following this up, is there any docs on the binary wire format for arrays?

Thanks
----- Original Message -----
From: John Lister
To: pgsql-hackers@postgresql.org
Sent: Tuesday, April 07, 2009 7:54 PM
Subject: [HACKERS] Array types

Hi, using v8.3.5 and a number of client libraries (java, python, pgadmin) and playing about with arrays.

They all return arrays as text, is it possible to configure postgresql to return an array in native form (does postgresql support such a thing)? This is using both the simple and extended query forms - i couldn't see a way to say what return type i wanted in the protocol docs...

This would seem much better in terms of performance, both size and speed(conversion).

Thanks

--

Got needs? Get Goblin'! - http://www.pricegoblin.co.uk/

#14Andrew Chernow
ac@esilo.com
In reply to: John Lister (#13)
Re: Array types

John Lister wrote:

Following this up, is there any docs on the binary wire format for arrays?

None that I know of.

Check out the backend source: (array_recv() and array_send() functions)
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/arrayfuncs.c?rev=1.154

Or, look at libpqtypes array.c:
http://libpqtypes.esilo.com/browse_source.html?file=array.c

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#15Andrew Chernow
ac@esilo.com
In reply to: Andrew Chernow (#14)
Re: Array types

Andrew Chernow wrote:

John Lister wrote:

Following this up, is there any docs on the binary wire format for
arrays?

None that I know of.

Check out the backend source: (array_recv() and array_send() functions)
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/arrayfuncs.c?rev=1.154

Or, look at libpqtypes array.c:
http://libpqtypes.esilo.com/browse_source.html?file=array.c

Forgot to mention, this is not as simple as understanding the array
format. You have to understand the wire format for all types that can
be array elements. The array wire format serializes its array elements
as [elem_len][elem_data]. elem_data is the wire format of the array
element type, like an int, timestamp, polygon, bytea, etc... So once
you unravel the array container format, you still have to demarshal the
type data.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#16Merlin Moncure
mmoncure@gmail.com
In reply to: John Lister (#13)
Re: Array types

On Wed, Apr 8, 2009 at 10:48 AM, John Lister
<john.lister-ps@kickstone.com> wrote:

Following this up, is there any docs on the binary wire format for arrays?

Thanks

Does java wrap libpq? If so, your best bet is probably going to be to
go the libpqtypes route. If you want help doing that, you are more
than welcome to ask (probably should move this thread to the
libqptypes list). If not, you are headed for a 'much bigger than it
looks on the surface' challenge...there are a lot of types...trust me
on this one. If you want help with libpqtypes you can ask on our list
on pgfoundry.

merlin

#17John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

Cheers for the pointers. Am i right in thinking that if i get an array of
arrays, the nested arrays are sent in wire format as well - it seems to be
from the docs.

Secondly, comments are a bit scarse in the code, but am i also right in
thinking that an array indexing can start at an arbitrary value? This seems
to be what the lbound value is for... or is this a addition to deal with
nulls eg, {null, null, null, 4} would have a lbound of 3.... (or both)

Thanks

----- Original Message -----
From: "Andrew Chernow" <ac@esilo.com>
To: "John Lister" <john.lister-ps@kickstone.com>
Cc: <pgsql-hackers@postgresql.org>
Sent: Wednesday, April 08, 2009 4:07 PM
Subject: Re: [HACKERS] Array types

Show quoted text

Andrew Chernow wrote:

John Lister wrote:

Following this up, is there any docs on the binary wire format for
arrays?

None that I know of.

Check out the backend source: (array_recv() and array_send() functions)
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/arrayfuncs.c?rev=1.154
Or, look at libpqtypes array.c:
http://libpqtypes.esilo.com/browse_source.html?file=array.c

Forgot to mention, this is not as simple as understanding the array
format. You have to understand the wire format for all types that can be
array elements. The array wire format serializes its array elements as
[elem_len][elem_data]. elem_data is the wire format of the array element
type, like an int, timestamp, polygon, bytea, etc... So once you unravel
the array container format, you still have to demarshal the type data.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#18John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

No unfortunately not, it is a JDBC type 4 java which is entirely written in
java. I've patched (as pointed out in another list) the base version to
handle binary data (still a couple of issues that seem unfinished) which has
given me clues, but the patch only supports simple types. I'm looking to
create translator for arrays now..

I was hoping to use the java type handling for the internals of the array
and throw an error on any "unknown" ones... I only use ints and floats in my
arrays, so may leave it at that if it works and look at doing it properly
later...

Show quoted text

On Wed, Apr 8, 2009 at 10:48 AM, John Lister
<john.lister-ps@kickstone.com> wrote:

Following this up, is there any docs on the binary wire format for
arrays?

Thanks

Does java wrap libpq? If so, your best bet is probably going to be to
go the libpqtypes route. If you want help doing that, you are more
than welcome to ask (probably should move this thread to the
libqptypes list). If not, you are headed for a 'much bigger than it
looks on the surface' challenge...there are a lot of types...trust me
on this one. If you want help with libpqtypes you can ask on our list
on pgfoundry.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andrew Dunstan
andrew@dunslane.net
In reply to: Merlin Moncure (#16)
Re: Array types

Merlin Moncure wrote:

On Wed, Apr 8, 2009 at 10:48 AM, John Lister
<john.lister-ps@kickstone.com> wrote:

Following this up, is there any docs on the binary wire format for arrays?

Thanks

Does java wrap libpq?

No. The JDBC driver is a Type 4 pure java driver. It implements the wire
protocol.

That's is what makes it portable.

Cheers

andrew

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Lister (#17)
Re: Array types

"John Lister" <john.lister-ps@kickstone.com> writes:

Cheers for the pointers. Am i right in thinking that if i get an array of
arrays, the nested arrays are sent in wire format as well - it seems to be
from the docs.

Postgres doesn't have arrays of arrays. There are multi-dimensional
arrays, which aren't conceptually the same thing.

regards, tom lane

#21Greg Stark
stark@enterprisedb.com
In reply to: John Lister (#17)
Re: Array types

On Wed, Apr 8, 2009 at 4:11 PM, John Lister
<john.lister-ps@kickstone.com> wrote:

Cheers for the pointers. Am i right in thinking that if i get an array of
arrays, the nested arrays are sent in wire format as well - it seems to be
from the docs.

No, you can't easily get an array of arrays in Postgres. You can get
multi-dimensional arrays but that's one big array with multiple
dimensions. The text output form does look like an array of arrays
but they don't behave like you might think they would:

postgres=# select array[array[1,2,3,4],array[5,6,7,8]];
array
-----------------------
{{1,2,3,4},{5,6,7,8}}
(1 row)

postgres=# select '{{1,2,3,4},{5,6,7,8}}'::int[];
int4
-----------------------
{{1,2,3,4},{5,6,7,8}}
(1 row)

postgres=# select ('{{1,2,3,4},{5,6,7,8}}'::int[])[1];
int4
------

(1 row)

postgres=# select ('{{1,2,3,4},{5,6,7,8}}'::int[])[1][1];
int4
------
1
(1 row)

Secondly, comments are a bit scarse in the code, but am i also right in
thinking that an array indexing can start at an arbitrary value? This seems
to be what the lbound value is for... or is this a addition to deal with
nulls eg, {null, null, null, 4} would have a lbound of 3.... (or both)

No, nulls are handled using a bitmap inside the array data structure.

Array bounds don't have to start at 1, they can start below 1 or above 1.

postgres=# select ('[-2:-1][5:8]={{1,2,3,4},{5,6,7,8}}'::int[])[-2][5];
int4
------
1
(1 row)

--
greg

#22John Lister
john.lister-ps@kickstone.com
In reply to: John Lister (#1)
Re: Array types

On Wed, Apr 8, 2009 at 4:11 PM, John Lister
<john.lister-ps@kickstone.com> wrote:

Cheers for the pointers. Am i right in thinking that if i get an array of
arrays, the nested arrays are sent in wire format as well - it seems to
be
from the docs.

No, you can't easily get an array of arrays in Postgres. You can get
multi-dimensional arrays but that's one big array with multiple
dimensions. The text output form does look like an array of arrays
but they don't behave like you might think they would:

Cheers, it wasn't clear if you have an array of arrays of which the nested
ones were of a different type. but it looks like all the values have to be
the same type,
eg

select (array[array[1,2,3,4],array['test']])

fails..

this makes life simpler :)

Secondly, comments are a bit scarse in the code, but am i also right in
thinking that an array indexing can start at an arbitrary value? This
seems
to be what the lbound value is for... or is this a addition to deal with
nulls eg, {null, null, null, 4} would have a lbound of 3.... (or both)

No, nulls are handled using a bitmap inside the array data structure.

Array bounds don't have to start at 1, they can start below 1 or above 1.

postgres=# select ('[-2:-1][5:8]={{1,2,3,4},{5,6,7,8}}'::int[])[-2][5];
int4
------

Somehow missed the bounds in the docs. Cheers that has cleared that up...

JOHN

#23Merlin Moncure
mmoncure@gmail.com
In reply to: Greg Stark (#21)
Re: Array types

On Wed, Apr 8, 2009 at 11:35 AM, Greg Stark <stark@enterprisedb.com> wrote:

On Wed, Apr 8, 2009 at 4:11 PM, John Lister
<john.lister-ps@kickstone.com> wrote:

Cheers for the pointers. Am i right in thinking that if i get an array of
arrays, the nested arrays are sent in wire format as well - it seems to be
from the docs.

No, you can't easily get an array of arrays in Postgres. You can get
multi-dimensional arrays but that's one big array with multiple
dimensions.  The text output form does look like an array of arrays
but they don't behave like you might think they would:

one note about that: you can have array of composites with arrays in
them, so you can get arbitrary levels of nesting.

merlin