SQL99 ARRAY support proposal

Started by Joe Conwayabout 23 years ago56 messageshackers
Jump to latest
#1Joe Conway
mail@joeconway.com

I'd like to implement SQL99/200x ARRAY support (well, at least the
basics). Does anyone have objections to the following grammar/semantics?

===========================================================
Per SQL200x - examples
===========================================================
create table foo(f1 integer ARRAY[3]);
result same as create table foo(f1 integer[3]); the 3
really does nothing (this deviates from spec but is
same as current implementation)
create table foo(f1 integer ARRAY);
result same as create table foo(f1 integer[]);
select ARRAY[1,2,3];
result '{1,2,3}'
select ARRAY[(select oid from pg_class order by relname)];
result is array of all the oid's in pg_class in relname order
select ARRAY[1,2] || 3
result '{1,2,3}'
select ARRAY[1,2] || ARRAY[3,4]
result '{1,2,3,4}'

===========================================================
Proposed PostgreSQL extensions
===========================================================
select ARRAY[[1,2,3], [4,5,6]];
result '{{1,2,3},{4,5,6}}'
select ARRAY[[1,2],[3,4]] || 5
result '{{1},{3},{5}}'
select ARRAY[[1,2],[3,4]] || [5,6]
result '{{1,2},{3,4},{5,6}}'
use UNION's algorithm for deriving the element type when not specified

Comments, suggestions, objections?

Thanks,

Joe

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#1)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

select ARRAY[1,2,3];
result '{1,2,3}'

The array type is determined how? I'd like this syntax better if there
were a way to force the choice of array type...

select ARRAY[(select oid from pg_class order by relname)];
result is array of all the oid's in pg_class in relname order

Puh-leez tell me that's not in the spec. How is one supposed to
distinguish this usage from the scalar-subselect case?

select ARRAY[1,2] || 3
result '{1,2,3}'

Datatypes? How many variants of the || operator do you plan to offer?
What will be the side-effects on the parser's ability to pick one?

select ARRAY[1,2] || ARRAY[3,4]
result '{1,2,3,4}'

Same worry as above; likewise for the "proposed extensions".

regards, tom lane

#3Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#2)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

select ARRAY[1,2,3];
result '{1,2,3}'

The array type is determined how? I'd like this syntax better if there
were a way to force the choice of array type...

What about:
select integer ARRAY[1,2,3];
result '{1,2,3}'::integer[]

select ARRAY[(select oid from pg_class order by relname)];
result is array of all the oid's in pg_class in relname order

Puh-leez tell me that's not in the spec. How is one supposed to
distinguish this usage from the scalar-subselect case?

Well, SQL99 has this:
<array value constructor> ::=
<array value list constructor>
<array value list constructor> ::=
ARRAY <left bracket or trigraph>
<array element list>
<right bracket or trigraph>

but SQL200x has this:

<array value constructor> ::=
<array value constructor by enumeration> |
<array value constructor by query>
<array value constructor by enumeration> ::=
ARRAY <left bracket or trigraph>
<array element list>
<right bracket or trigraph>
<array value constructor by query> ::=
ARRAY <left paren>
<query expression> [ <order by clause> ]
<right paren>

select ARRAY[1,2] || 3
result '{1,2,3}'

Datatypes?

maybe?

select integer ARRAY[1,2] || 3
result '{1,2,3}'::integer[]

How many variants of the || operator do you plan to offer?

One for each builtin datatype[]/datatype pair (e.g. integer[]/integer),
and another for each datatype[] (e.g. integer[]/integer[])

What will be the side-effects on the parser's ability to pick one?

Not really sure. I figured I'd cross that bridge when I got to it. Are
there specific landmines you're thinking of in there?

select ARRAY[1,2] || ARRAY[3,4]
result '{1,2,3,4}'

select integer ARRAY[1,2] || integer ARRAY[3,4]
result '{1,2,3,4}'::integer[]

Or else the use UNION's algorithm for deriving the element type (you
suggested this previously, but I may have misunderstood)

Joe

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#3)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

The array type is determined how? I'd like this syntax better if there
were a way to force the choice of array type...

What about:
select integer ARRAY[1,2,3];
result '{1,2,3}'::integer[]

By analogy to the "type 'literal'" syntax? I'd prefer not to go that
route, because that syntax for literals is a horrid kluge --- to keep
bison from spitting up, we've had to put a bunch of nasty restrictions
on the type names that can appear in such constructs. All those
restrictions would have to apply here, too.

It's possible that we could use the other cast syntaxes:
ARRAY[1,2,3]::integer[]
CAST(ARRAY[1,2,3] AS integer[])
It would take some hackery to propagate the destination type down into
the ARRAY[] before the latter's type resolution is done, but at least
it'd be a quite localized hack.

Puh-leez tell me that's not in the spec. How is one supposed to
distinguish this usage from the scalar-subselect case?

but SQL200x has this:

<array value constructor> ::=
<array value constructor by enumeration> |
<array value constructor by query>
<array value constructor by enumeration> ::=
ARRAY <left bracket or trigraph>
<array element list>
<right bracket or trigraph>
<array value constructor by query> ::=
ARRAY <left paren>
<query expression> [ <order by clause> ]
<right paren>

This I could live with --- note the difference in punctuation. There
would be a clear syntactic and semantic difference between
ARRAY(SELECT ...) and ARRAY[(SELECT ...)].

How many variants of the || operator do you plan to offer?

One for each builtin datatype[]/datatype pair (e.g. integer[]/integer),
and another for each datatype[] (e.g. integer[]/integer[])

That does not seem workable at all. It'll be a maintenance nightmare
even worse than what we already have for array equality; and I suspect
you'll find that the parser will have trouble resolving which || to pick.

I have been toying with the idea of replacing all the array-equality
operators with a single pg_proc/pg_operator entry for "ANYARRAY = ANYARRAY".
Now that arrays carry their element type in the header, a single
function could implement all these variants. Type mismatch (eg,
int[] = float[]) would become a runtime error instead of a parse error,
but that seems acceptable.

I see how the array || array cases could be handled similarly, but I
don't see how to make array || scalar work that way. If we declared an
operator ANYARRAY || ANY then there'd be no way for it to check at
runtime that the right-hand operand matched the lefthand element type.
(Also, it'd create resolution conflicts with ANYARRAY || ANYARRAY.)

It'd be nice to have a general-purpose solution for this problem.
If we did, the stuff in contrib/array would benefit too, and perhaps
have (at last) a shot at becoming mainstream.

Speculating wildly: what if we invent a new pseudotype ANYARRAYELEMENT,
which would be allowed only in operator/function declarations that also
use ANYARRAY, eg
ANYARRAY || ANYARRAYELEMENT
and would match only the element type of the array in the ANYARRAY
position. I think it'd be possible to hack the parser to make such a
check in parse_coerce. There might be a cleaner way to do this, but
I do feel that something along this line is a better approach than
creating N different pg_operator entries for the same operation.

regards, tom lane

#5Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#4)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

By analogy to the "type 'literal'" syntax? I'd prefer not to go that
route, because that syntax for literals is a horrid kluge --- to keep
bison from spitting up, we've had to put a bunch of nasty restrictions
on the type names that can appear in such constructs. All those
restrictions would have to apply here, too.

It's possible that we could use the other cast syntaxes:
ARRAY[1,2,3]::integer[]
CAST(ARRAY[1,2,3] AS integer[])
It would take some hackery to propagate the destination type down into
the ARRAY[] before the latter's type resolution is done, but at least
it'd be a quite localized hack.

OK -- I'll try to make that work. I presume that in the non-specified
case "ARRAY[1,2,3]" I should use something similar to UNION's resolution
rules?

<array value constructor by enumeration> ::=
ARRAY <left bracket or trigraph>
<array element list>
<right bracket or trigraph>
<array value constructor by query> ::=
ARRAY <left paren>
<query expression> [ <order by clause> ]
<right paren>

This I could live with --- note the difference in punctuation. There
would be a clear syntactic and semantic difference between
ARRAY(SELECT ...) and ARRAY[(SELECT ...)].

Sorry -- I guess I mis-read that. So "ARRAY(SELECT ...)" it is.

[...lots of good ideas regarding generalizing array operations...]

I played with generalizing array functions a bit for plr and ran into
some problems (which I can't specifically recall at the moment), but
clearly that's the way to go. I'll start playing with your suggestions
in C code, and report back for more feedback as it solidifies.

Thanks!

Joe

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#5)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

It would take some hackery to propagate the destination type down into
the ARRAY[] before the latter's type resolution is done, but at least
it'd be a quite localized hack.

OK -- I'll try to make that work. I presume that in the non-specified
case "ARRAY[1,2,3]" I should use something similar to UNION's resolution
rules?

Yeah, that'd make sense to me.

[...lots of good ideas regarding generalizing array operations...]

I played with generalizing array functions a bit for plr and ran into
some problems (which I can't specifically recall at the moment), but
clearly that's the way to go. I'll start playing with your suggestions
in C code, and report back for more feedback as it solidifies.

It'd be useful if you can reconstruct what problems you ran into.

After more thought I'm thinking that we should specify these pseudotypes
like so: a given set of actual operand types matches an
operator/function whose definition uses ANYARRAY or ANYELEMENT only if
the same element type is involved at all positions. Thus in
ANYARRAY = ANYARRAY
the left and right operands must be the same array type (so we preserve
the existing parse-time check, but we only need one pg_operator entry
to do it). For
ANYARRAY || ANYELEMENT
we get the desired behavior. And there are some other interesting
possibilities: for example, a function could be declared
foo(ANY, ANYELEMENT, ANYELEMENT)
which would mean that it takes any three datatypes, but the second
and third arguments have to be the same datatype.

If we run into limitations of this scheme, it could be generalized
further by adding pseudotypes ANYARRAY2 and ANYELEMENT2 that have
the same restriction among themselves, but are independent of
ANYARRAY/ANYELEMENT. Then for example
foo(ANYELEMENT, ANYELEMENT, ANYELEMENT2, ANYELEMENT2)
takes four params that can be anything so long as the first two are
the same datatype and the second two are also the same datatype (but
possibly a different type from the first two). And you could add
ANYARRAY3, etc, till you got bored. But I'd not bother with this
until someone showed a concrete need for it. At the moment, the
useful examples I can think of don't need more than one "free variable"
in their set of argument datatypes.

regards, tom lane

#7Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#6)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

I played with generalizing array functions a bit for plr and ran into
some problems (which I can't specifically recall at the moment), but
clearly that's the way to go. I'll start playing with your suggestions
in C code, and report back for more feedback as it solidifies.

It'd be useful if you can reconstruct what problems you ran into.

I've played around a bit more and refreshed my memory -- here are two
problems:

CREATE OR REPLACE FUNCTION array_push (anyarray, any)
RETURNS anyarray
AS '$libdir/plr','array_push'
LANGUAGE 'C';
ERROR: parser: parse error at or near "any" at character 50

It seems that "any" is not accepted as a function parameter type. From
gram.y it appears that the cause is that "any" is a reserved keyword:

<snip>
/*
* Name classification hierarchy.
*
* IDENT is the lexeme returned by the lexer for identifiers that match
* no known keyword. In most cases, we can accept certain keywords as
</snip>

<snip>
/* Type identifier --- names that can be type names.
*/
type_name: IDENT { $$ = $1; }
| unreserved_keyword { $$ = pstrdup($1); }
;
</snip>

So for grins I did this:
regression=# select oid,typname from pg_type where typname like '%any%';
oid | typname
------+----------
2276 | any
2277 | anyarray
(2 rows)

regression=# update pg_type set typname = 'anyscalar' where oid = 2276;
UPDATE 1

CREATE OR REPLACE FUNCTION array_push (anyarray, anyscalar)
RETURNS anyarray
AS '$libdir/plr','array_push'
LANGUAGE 'C';

regression=# select array_push('{1,2}'::integer[],3::integer);
array_push
------------
{1,2,3}
(1 row)

So far, so good. But now the second problem:
select f1[2] from
(select array_push('{1,2}'::integer[],3::integer) as f1) as t;
ERROR: transformArraySubscripts: type anyarray is not an array

I'm just starting to dig into this one.

Joe

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#7)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

It seems that "any" is not accepted as a function parameter type.

You have to double-quote it.

We could perhaps rename it to avoid the keyword conflict; I'd lean
towards "anytype" if we do ("anyscalar" seems misleading; I'd expect
that to exclude arrays). I think I chose ANY because there was
precedent in CREATE AGGREGATE for that.

So far, so good. But now the second problem:
select f1[2] from
(select array_push('{1,2}'::integer[],3::integer) as f1) as t;
ERROR: transformArraySubscripts: type anyarray is not an array

Mph. I'm not sure we can make that work readily ... unless you want
to make the parser assume that a function taking and returning ANYARRAY
actually returns the same array type as its input is. Otherwise the
parser has no way to determine the datatype yielded by f1[2].

regards, tom lane

#9Hannu Krosing
hannu@tm.ee
In reply to: Joe Conway (#7)
Re: SQL99 ARRAY support proposal

Joe Conway kirjutas E, 10.03.2003 kell 05:35:

CREATE OR REPLACE FUNCTION array_push (anyarray, anyscalar)
RETURNS anyarray
AS '$libdir/plr','array_push'
LANGUAGE 'C';

could you make it

RETURNS typeof($1)

?

--------------
Hannu

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#9)
Re: SQL99 ARRAY support proposal

Hannu Krosing <hannu@tm.ee> writes:

Joe Conway kirjutas E, 10.03.2003 kell 05:35:

CREATE OR REPLACE FUNCTION array_push (anyarray, anyscalar)
RETURNS anyarray

could you make it
RETURNS typeof($1)

Not directly --- we have to fit the return-type info into an OID field.
We could fake it by inventing one or more pseudotypes, "SAMEASPARAMn".

But I think I like better the notion of extending my bound-together-
ANYARRAY-and-ANYELEMENT proposal,
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00319.php

Suppose that we do that, and then further say that ANYARRAY or
ANYELEMENT appearing as the return type implies that the return type
is actually the common element or array type. Then we have such
useful behaviors as:

array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) yields anyarray

The last three cases cannot be handled by a SAMEASPARAM construct.

regards, tom lane

#11Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#10)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

But I think I like better the notion of extending my bound-together-
ANYARRAY-and-ANYELEMENT proposal,
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00319.php

Suppose that we do that, and then further say that ANYARRAY or
ANYELEMENT appearing as the return type implies that the return type
is actually the common element or array type. Then we have such
useful behaviors as:

array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) yields anyarray

The last three cases cannot be handled by a SAMEASPARAM construct.

That was my concern also. I like the above.

So if I understand correctly, all instances of anyarray and anyelement
in a function definition would need to be self-consistent, but the group
could represent essentially any datatype with its corresponding array
type. If we need more than one of these self consistent groups, we could
resort to anyarray1/anyelement1, etc. Does this sound correct?

Also, an implementation question: if I have a type oid for an element,
what is the preferred method for determining the corresponding array?
I'm thinking that the most efficient method might be to use the
element-type name with a '_' prepended to get the array-type oid, but
that seems ugly. Thoughts?

Thanks,

Joe

#12Jason M. Felice
jfelice@cronosys.com
In reply to: Joe Conway (#11)
Re: SQL99 ARRAY support proposal

On Mon, Mar 10, 2003 at 09:49:47AM -0500, Tom Lane wrote:

Hannu Krosing <hannu@tm.ee> writes:

Joe Conway kirjutas E, 10.03.2003 kell 05:35:

CREATE OR REPLACE FUNCTION array_push (anyarray, anyscalar)
RETURNS anyarray

could you make it
RETURNS typeof($1)

Not directly --- we have to fit the return-type info into an OID field.
We could fake it by inventing one or more pseudotypes, "SAMEASPARAMn".

But I think I like better the notion of extending my bound-together-
ANYARRAY-and-ANYELEMENT proposal,
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00319.php

Suppose that we do that, and then further say that ANYARRAY or
ANYELEMENT appearing as the return type implies that the return type
is actually the common element or array type. Then we have such
useful behaviors as:

array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) yields anyarray

The last three cases cannot be handled by a SAMEASPARAM construct.

... typeof($1)[], or a ARRAYELEMSAMEASPARAM construct?

I'm really liking this discussion. I know this is sort of "out there", but
I have found in languages like StandardML and Objective CAML that templatized-
type functions are _extremely_ useful. These languages type systems are
amazingly powerful (the language syntax is another matter *sigh*).

I'm not necessarily suggesting implementing this, but I just want to feed the
debate a bit. I view the type system of these guys as "the ideal", and would
be in ecstacy if PostgreSQL had it, but I realize implementing the thing would
prolly be far from practical.

First, there are templatized types. Arrays in PostgreSQL are sort of a
kludge of templatized types, but they would be defined like so:

type a' array = <some definition ...>

which means that you are describing an array of some type a' (the apostrophe
indicates a type variable).

You can also create other neat templatized types as an aside:

type a' Nullable = Null | Value of a'

Which means the expressions:
Value 47 --> of type int Nullable
Null --> of type a' Nullable (determined from context)

But then, you could also say:

int array array

Or even:

int Nullable array

Which is somthing you can't in PostgreSQL but would be very nice. But then
you could say:

let invert_matrix m : a' array array -> a' array array = <code...>

let multiply x : a', y : a' -> a' = <code ...>

You could have more than one type variable in a templatized type or function,
true, but I've never really needed more than one. I can imagine cases where
it would be useful, but just haven't needed one.

Plus:
* get rid of horrible 'int4_' type hacks for array.
Minus:
* can't use oid to represent exact type, rather a string of oids.
* need second table to hold function type constraints when function
is templatized. (or could make it params "oid array array", aka
oid[][]!) Reserve eight or ten oids for template parameter slots
(in other words, for a' through j' or something).

Warning: I have been called the "type nazi" <g>

One other thing from StandardML that I have always wanted in PostgreSQL
(or anywhere else I program, for that matter)- record types. (Warning, this is
also very wishful thinking and "out there").

In ML/CAML, a record type is defined like so:

type myrecord = {
x : int,
y : int,
s : string
};

"myrecord" is actually just type alias, the canonical record definition is:

{s:string, x:int, y:int}

... with the attributes in alphabetical order, because unless you are mucking
with pointers in C, it really doesn't matter what order they are in. The
first advantage become very apparent: Any two records with the same named
attributes of the same types are always of the same type. In PostgreSQL,
this would mean that functions that operate on RECORD{x:int,y:int,s:string}
could operate on a record from any relation with those attributes.

Further, to make inheritance pretty much unnecesary, you could allow a
record with more attributes to satisfy a parameter or return value constraint.
In other words, you could call function foo(RECORD{x:int,y:int}) on a
RECORD{s:string,x:int,y:int}.

I've thought about this trick a lot. In theory there is a possibility of
not getting what you want, but in practice it would almost never happen. The
demostrative case would be calling distance_from_origin(RECORD{x:int,y:int})
on RECORD{x:int,y:int,z:int}, but in this case you need to make a
distance_from_origin(RECORD{x:int,y:int,z:int}).

This way, you could make a function which operates on RECORD{oid:oid} which
could be called on any record from a table. I've wanted to do this sort of
thing on several occasions- one application has notes which can be attached
to any row, sort of like PostgreSQL comments. Another to keep track of what
user changed which fields. Etc.

Alright.... heheh I didn't mean to ramble. /ME gets head out of clouds
and goes back to practical work <g>

Show quoted text

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordo0mo@postgresql.org

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#11)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

So if I understand correctly, all instances of anyarray and anyelement
in a function definition would need to be self-consistent, but the group
could represent essentially any datatype with its corresponding array
type. If we need more than one of these self consistent groups, we could
resort to anyarray1/anyelement1, etc. Does this sound correct?

Right.

Also, an implementation question: if I have a type oid for an element,
what is the preferred method for determining the corresponding array?
I'm thinking that the most efficient method might be to use the
element-type name with a '_' prepended to get the array-type oid, but
that seems ugly. Thoughts?

I was thinking about that earlier. Right now there is no other way.
We could consider adding a column to pg_type to link to the array type,
but I'm not sure it's worth the trouble. I think if you look around,
there's probably already a subroutine someplace that does the lookup
using the '_foo' approach.

regards, tom lane

#14Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Joe Conway (#1)
Re: SQL99 ARRAY support proposal

So if I understand correctly, all instances of anyarray and anyelement
in a function definition would need to be self-consistent, but the group
could represent essentially any datatype with its corresponding array
type. If we need more than one of these self consistent groups, we could
resort to anyarray1/anyelement1, etc. Does this sound correct?

Also, an implementation question: if I have a type oid for an element,
what is the preferred method for determining the corresponding array?
I'm thinking that the most efficient method might be to use the
element-type name with a '_' prepended to get the array-type oid, but
that seems ugly. Thoughts?

What about a cast? 1::arraytype

Chris

#15Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#10)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

But I think I like better the notion of extending my bound-together-
ANYARRAY-and-ANYELEMENT proposal,
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00319.php

Suppose that we do that, and then further say that ANYARRAY or
ANYELEMENT appearing as the return type implies that the return type
is actually the common element or array type. Then we have such
useful behaviors as:

array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) yields anyarray

Before I get too far along, I'd like to get some feedback. The attached
patch implements Tom's bound-together-ANYARRAY-and-ANYELEMENT proposal
(and includes ANY as well, per earlier discussion). With it, the
following works:

CREATE OR REPLACE FUNCTION array_push (anyarray, anyelement)
RETURNS anyarray
AS '$libdir/plr','array_push'
LANGUAGE 'C';

regression=# select f1[2] from (select array_push('{1,2}'::integer[],3)
as f1) as t;
f1
----
2
(1 row)

Does it make sense to commit this now, or should it wait for the other
changes described below?

The following are my proposed next phases in array support changes.
Please let me know now if you think any of these won't fly (conceptually):

1) Implement SQL99/200x ARRAY[] changes proposed here:
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00297.php
as modified by
http://archives.postgresql.org/pgsql-hackers/2003-03/msg00315.php

2) Implement the following new builtin functions
array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) returns anyarray
- any reason not to call this one simply "array"?
split(text, text) returns text[]
- split string into array on delimiter
implode(text[], text) returns text
- join array elements into a string using given string delimiter

3) Modify contrib/array functions as needed and move to the backend. Or
possibly write equivalent functions from scratch -- I just noticed
this in contrib/array:

* This software is distributed under the GNU General Public License
* either version 2, or (at your option) any later version.

Is anyone still in contact with Massimo Dal Zotto? Any chance he
would change the license to BSD?

4) Update "User's Guide"->"Data Types"->"Arrays" documentation and
create a new section: "User's Guide"->
"Functions and Operators"->
"Array Functions and Operators"

Thoughts, comments, objections all welcomed.

Thanks,

Joe

Attachments:

array-gen.3.patchtext/plain; name=array-gen.3.patchDownload+250-44
#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#15)
Re: SQL99 ARRAY support proposal

Joe Conway <mail@joeconway.com> writes:

+  * If ANY, ANYARRAY, or ANYELEMENT is used for a function's arguments or
+  * return type, make sure the runtime types are consistent with
+  * each other. The argument consistency rules are like so:
+  *
+  * 1) All arguments declared ANY should have matching datatypes.
+  * 2) All arguments declared ANYARRAY should have matching datatypes.
+  * 3) All arguments declared ANYELEMENT should have matching datatypes.
+  * 4) If there are arguments of both ANYELEMENT and ANYARRAY, make sure
+  *    the runtime scalar argument type is in fact the element type for
+  *    the runtime array argument type.

Hmm. I don't see why we should drag ANY into this --- it should just be
a no-constraints placeholder, same as before. What's the gain from
constraining it that you don't get from ANYELEMENT?

+  * 5) If return type is ANYARRAY, no argument is ANYARRAY or ANYELEMENT,
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?

Probably. This case could be handled just as well by declaring the
output to be ANY, I'd think.

+  * 8) If return type is ANYELEMENT, no argument is ANYARRAY or ANYELEMENT
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?

Likewise. The point of (this reinterpretation of) ANYARRAY and
ANYELEMENT is to let the parser deduce the actual output type.
If it's not going to be able to deduce anything, use ANY instead.

regards, tom lane

#17Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#16)
Re: SQL99 ARRAY support proposal

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

+  * If ANY, ANYARRAY, or ANYELEMENT is used for a function's arguments or
+  * return type, make sure the runtime types are consistent with
+  * each other. The argument consistency rules are like so:

Hmm. I don't see why we should drag ANY into this --- it should just be
a no-constraints placeholder, same as before. What's the gain from
constraining it that you don't get from ANYELEMENT?

I was thinking of the case
create function foo("any") returns "any"
but I guess you're right, it can just as easily be
create function foo(anyelement) returns anyelement

I'll pull the ANY stuff out.

+  * 5) If return type is ANYARRAY, no argument is ANYARRAY or ANYELEMENT,
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?

Probably. This case could be handled just as well by declaring the
output to be ANY, I'd think.

+  * 8) If return type is ANYELEMENT, no argument is ANYARRAY or ANYELEMENT
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?

Likewise. The point of (this reinterpretation of) ANYARRAY and
ANYELEMENT is to let the parser deduce the actual output type.
If it's not going to be able to deduce anything, use ANY instead.

OK -- I'll take care of that too.

Thanks,

Joe

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Joe Conway (#15)
Re: SQL99 ARRAY support proposal

Joe Conway writes:

2) Implement the following new builtin functions
array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) returns anyarray
- any reason not to call this one simply "array"?
split(text, text) returns text[]
- split string into array on delimiter
implode(text[], text) returns text
- join array elements into a string using given string delimiter

I think this goes too far. It is just an invitation to people to create
bad database designs by using arrays as lists. Create an array support
package on gborg if you like, but I feel this should not be in the
mainline.

--
Peter Eisentraut peter_e@gmx.net

#19Joe Conway
mail@joeconway.com
In reply to: Peter Eisentraut (#18)
Re: SQL99 ARRAY support proposal

Peter Eisentraut wrote:

Joe Conway writes:

2) Implement the following new builtin functions
array_push(anyarray, anyelement) returns anyarray
array_pop(anyarray) returns anyelement
array_subscript(anyarray, int) yields anyelement
singleton_array(anyelement) returns anyarray
- any reason not to call this one simply "array"?
split(text, text) returns text[]
- split string into array on delimiter
implode(text[], text) returns text
- join array elements into a string using given string delimiter

I think this goes too far. It is just an invitation to people to create
bad database designs by using arrays as lists. Create an array support
package on gborg if you like, but I feel this should not be in the
mainline.

Sorry, I don't agree with that assessment. There are lots of ways people
can create bad database designs using the CREATE TABLE statement too ;-)

Arrays *do* have a place, and they are supported in SQL99+. We get
complaints almost daily regarding the poor array support -- this is just
a step toward improving that. If you want to debate the merit of
individual functions on that list, let's do that, but to dismiss them
all with a wave-of-the-hand is too far.

Joe

#20Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#16)
Re: [HACKERS] SQL99 ARRAY support proposal

Tom Lane wrote:

Hmm. I don't see why we should drag ANY into this --- it should just be
a no-constraints placeholder, same as before. What's the gain from
constraining it that you don't get from ANYELEMENT?

[...snip...]

XXX should this case be rejected at the point of function creation?

Probably. This case could be handled just as well by declaring the
output to be ANY, I'd think.

[...snip...]

Likewise. The point of (this reinterpretation of) ANYARRAY and
ANYELEMENT is to let the parser deduce the actual output type.
If it's not going to be able to deduce anything, use ANY instead.

Here's a new patch with the above corrections. I'm sending it to patches
in hopes it can be applied now rather than waiting. I think it stands
alone (shy some documentation, but I'm good for that ;-)) and makes
sense regardless of the other array support issues.

Thanks,

Joe

Attachments:

array-gen.5.patchtext/plain; name=array-gen.5.patchDownload+248-44
#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#19)
#22Peter Eisentraut
peter_e@gmx.net
In reply to: Joe Conway (#19)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#22)
#24Joe Conway
mail@joeconway.com
In reply to: Peter Eisentraut (#22)
#25johnnnnnn
john@phaedrusdeinus.org
In reply to: Joe Conway (#24)
#26Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: johnnnnnn (#25)
#27Joe Conway
mail@joeconway.com
In reply to: johnnnnnn (#25)
#28Joe Conway
mail@joeconway.com
In reply to: Alvaro Herrera (#26)
#29Rod Taylor
rbt@rbt.ca
In reply to: Joe Conway (#24)
#30Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#23)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#27)
#33Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#32)
#34Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#33)
#35Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Peter Eisentraut (#22)
#36scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#34)
#37Bruce Momjian
bruce@momjian.us
In reply to: scott.marlowe (#36)
#38scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#37)
#39Chris Browne
cbbrowne@acm.org
In reply to: Bruce Momjian (#37)
In reply to: Bruce Momjian (#37)
#41scott.marlowe
scott.marlowe@ihs.com
In reply to: Chris Browne (#39)
#42scott.marlowe
scott.marlowe@ihs.com
In reply to: Þórhallur Hálfdánarson (#40)
#43D'Arcy J.M. Cain
darcy@druid.net
In reply to: scott.marlowe (#41)
#44Andrew Dunstan
andrew@dunslane.net
In reply to: scott.marlowe (#42)
#45scott.marlowe
scott.marlowe@ihs.com
In reply to: D'Arcy J.M. Cain (#43)
#46Jason Earl
jason.earl@simplot.com
In reply to: scott.marlowe (#42)
#47scott.marlowe
scott.marlowe@ihs.com
In reply to: Jason Earl (#46)
#48Jason Earl
jason.earl@simplot.com
In reply to: scott.marlowe (#47)
#49Joe Conway
mail@joeconway.com
In reply to: Jason Earl (#48)
#50Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#20)
#51Joe Conway
mail@joeconway.com
In reply to: Bruce Momjian (#50)
#52Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#20)
#53Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#51)
#54Kevin Brown
kevin@sysexperts.com
In reply to: Joe Conway (#49)
#55scott.marlowe
scott.marlowe@ihs.com
In reply to: Kevin Brown (#54)
#56Joe Conway
mail@joeconway.com
In reply to: scott.marlowe (#55)