Possible marginally-incompatible change to array subscripting

Started by Tom Lane about 10 years ago, 11 messages
#1 Tom Lane
tgl@sss.pgh.pa.us

I'm reviewing Yury Zhuravlev's patch to allow array slice boundaries to be
omitted, for example "a[4:]" means "the slice extending from element 4 to
the last element of a". It strikes me that there's an improvement we
could easily make for the case where a mixture of slice and non-slice
syntax appears, that is something like "a[3:4][5]". Now, this has always
meant a slice, and the way we've traditionally managed that is to treat
simple subscripts as being the range upper bound with a lower bound of 1;
that is, what this example means is exactly "a[3:4][1:5]".

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

This would make no difference of course for the common case where the
array lower bound is 1, but it seems a lot less arbitrary when it isn't.
So I think we should strongly consider changing it to mean that, even
though it would be non-backwards-compatible in such cases.
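
To make the difference concrete, here is a minimal Python sketch (a model of the semantics, not PostgreSQL code) of how a bare subscript inside a mixed slice could be expanded under the two rules. The function name expand_subscripts, its parameters, and the tuple representation are assumptions made up for this illustration.

```python
from typing import List, Optional, Tuple, Union

# A subscript is either a single index (int) or a (lower, upper) pair,
# where None stands for an omitted bound, as in "a[4:]" or "a[:5]".
Subscript = Union[int, Tuple[Optional[int], Optional[int]]]


def expand_subscripts(
    subs: List[Subscript],
    lower_bounds: List[int],
    upper_bounds: List[int],
    implied_lower_is_one: bool,
) -> List[Tuple[int, int]]:
    """Turn a mixed slice such as a[3:4][5] into explicit (lower, upper) ranges.

    implied_lower_is_one=True models the traditional rule (bare n means 1:n).
    implied_lower_is_one=False models the proposed rule (bare n means :n,
    i.e. from the array's own lower bound up to n).
    """
    ranges = []
    for sub, lb, ub in zip(subs, lower_bounds, upper_bounds):
        if isinstance(sub, int):
            lo = 1 if implied_lower_is_one else lb
            ranges.append((lo, sub))
        else:
            lo, hi = sub
            # Omitted bounds default to the array's bounds in that dimension.
            ranges.append((lb if lo is None else lo, ub if hi is None else hi))
    return ranges


# a[3:4][5] on a hypothetical array whose second dimension runs from 0 to 9.
subs = [(3, 4), 5]
print(expand_subscripts(subs, [1, 0], [10, 9], implied_lower_is_one=True))
print(expand_subscripts(subs, [1, 0], [10, 9], implied_lower_is_one=False))
```

The two calls differ only because the second dimension's lower bound is 0; with a lower bound of 1 they return the same ranges.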

Comments?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#1)
Re: Possible marginally-incompatible change to array subscripting

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm reviewing Yury Zhuravlev's patch to allow array slice boundaries to be
omitted, for example "a[4:]" means "the slice extending from element 4 to
the last element of a". It strikes me that there's an improvement we
could easily make for the case where a mixture of slice and non-slice
syntax appears, that is something like "a[3:4][5]". Now, this has always
meant a slice, and the way we've traditionally managed that is to treat
simple subscripts as being the range upper bound with a lower bound of 1;
that is, what this example means is exactly "a[3:4][1:5]".

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

This would make no difference of course for the common case where the
array lower bound is 1, but it seems a lot less arbitrary when it isn't.
So I think we should strongly consider changing it to mean that, even
though it would be non-backwards-compatible in such cases.

Comments?

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].
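
For comparison, the three readings of a bare subscript n in a mixed slice can be put side by side in a small Python sketch. This is only a model of the discussion; the function implied_range and the rule names are hypothetical labels, not anything in PostgreSQL.

```python
from typing import Tuple


def implied_range(n: int, array_lower_bound: int, rule: str) -> Tuple[int, int]:
    """Return the (lower, upper) range implied by a bare subscript n."""
    if rule == "traditional":
        # Long-standing behavior: a[3:4][5] means a[3:4][1:5].
        return (1, n)
    if rule == "array_lower_bound":
        # Reading under discussion: a[3:4][5] means a[3:4][:5].
        return (array_lower_bound, n)
    if rule == "single_element":
        # The expectation above: a[3:4][5] means a[3:4][5:5].
        return (n, n)
    raise ValueError(f"unknown rule: {rule}")


for rule in ("traditional", "array_lower_bound", "single_element"):
    lo, hi = implied_range(5, array_lower_bound=1, rule=rule)
    print(f"{rule}: [{lo}:{hi}] selects {hi - lo + 1} position(s)")
```

With an array lower bound of 1 the first two rules agree, while the third narrows the slice to a single position in that dimension.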

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3 Yury Zhuravlev
u.zhuravlev@postgrespro.ru
In reply to: Tom Lane (#1)
Re: Possible marginally-incompatible change to array subscripting

This would make no difference of course for the common case where the
array lower bound is 1, but it seems a lot less arbitrary when it isn't.
So I think we should strongly consider changing it to mean that, even
though it would be non-backwards-compatible in such cases.

Comments?

If you are going to break backwards compatibility anyway, could arrays be made
to behave as they do in C/C++/Python/Ruby and other languages?
I'm sorry to bring up this thread again...

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

a[3:4][:5] instead of a[3:4][5]: at least that is logical. But then what would
a[3:4][5] return? One element?

Thanks.

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: Possible marginally-incompatible change to array subscripting

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].

Yeah, probably, now that you mention it ... but that seems like too much
of a compatibility break. Or does anyone want to argue for just doing
that and never mind the compatibility issues? This is a pretty weird
corner case already; there can't be very many people relying on it.

Another point worth realizing is that the implicit insertion of "1:"
happens in the parser, meaning that existing stored views/rules will dump
out with that added and hence aren't going to change meaning no matter
what we decide here.
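
A toy Python sketch of why stored definitions are unaffected: if the lower bound is made explicit at parse time, the stored text already contains "1:" and reads the same under any later rule. This is a string-level imitation only; the real parser works on parse trees, and the function name below is hypothetical. Nested subscripts such as a[b[1]:2] are not handled.

```python
import re

# Matches one bracketed subscript without nesting, e.g. "[3:4]" or "[5]".
_SUBSCRIPT = re.compile(r"\[([^\[\]]*)\]")


def deparse_with_explicit_bounds(expr: str) -> str:
    """Imitate the traditional parse-time rewrite of a mixed slice.

    If any subscript uses slice syntax, every bare subscript n becomes 1:n.
    Pure element access (no ':' anywhere) is left alone.
    """
    subs = _SUBSCRIPT.findall(expr)
    if not any(":" in s for s in subs):
        return expr

    def make_explicit(match: "re.Match[str]") -> str:
        body = match.group(1)
        return f"[{body}]" if ":" in body else f"[1:{body}]"

    return _SUBSCRIPT.sub(make_explicit, expr)


stored = deparse_with_explicit_bounds("a[3:4][5]")
print(stored)
# Re-reading the stored form changes nothing, since no bare subscript remains.
print(deparse_with_explicit_bounds(stored) == stored)
```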

(BTW, now that I've read the patch a bit further, it actually silently
changed the semantics as I'm suggesting already. We could undo that
without too much extra code, but I feel that we shouldn't. Robert's
idea seems like a plausible alternative, but it would take a nontrivial
amount of code to implement it unless we are willing to double-evaluate
such a subscript.)
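
The double-evaluation concern can be illustrated with a short Python sketch. It assumes a subscript expression that returns a different value each time it is evaluated; volatile_subscript and the two helper functions are stand-ins invented for the example, not PostgreSQL code.

```python
import itertools
from typing import Callable, Tuple

_counter = itertools.count(1)


def volatile_subscript() -> int:
    """Stand-in for a subscript expression whose value changes per evaluation."""
    return next(_counter)


def range_double_eval(expr: Callable[[], int]) -> Tuple[int, int]:
    # Naive n:n expansion: the expression is evaluated once per bound,
    # so the two bounds can disagree.
    return (expr(), expr())


def range_single_eval(expr: Callable[[], int]) -> Tuple[int, int]:
    # Evaluate once and reuse the value for both bounds.
    n = expr()
    return (n, n)


print(range_double_eval(volatile_subscript))
print(range_single_eval(volatile_subscript))
```

Evaluating once and reusing the result is the safe form, which is presumably where the extra code would be needed.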

regards, tom lane

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yury Zhuravlev (#3)
Re: Possible marginally-incompatible change to array subscripting

Yury Zhuravlev <u.zhuravlev@postgrespro.ru> writes:

If you are going to break backwards compatibility anyway, could arrays be made
to behave as they do in C/C++/Python/Ruby and other languages?
I'm sorry to bring up this thread again...

I am not sure just exactly how incompatible that would be, but surely it
would break enormously more code than what we're discussing here.
So no, I don't think any such proposal has a chance. There are degrees
of incompatibility, and considering a small/narrow one does not mean that
we'd also consider major breakage.

regards, tom lane

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: Possible marginally-incompatible change to array subscripting

On Tue, Dec 22, 2015 at 12:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].

Yeah, probably, now that you mention it ... but that seems like too much
of a compatibility break. Or does anyone want to argue for just doing
that and never mind the compatibility issues? This is a pretty weird
corner case already; there can't be very many people relying on it.

To be honest, I'd be inclined not to change the semantics at all. But
that's just me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7 Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Haas (#6)
Re: Possible marginally-incompatible change to array subscripting

On 12/22/2015 10:01 AM, Robert Haas wrote:

On Tue, Dec 22, 2015 at 12:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].

Yeah, probably, now that you mention it ... but that seems like too much
of a compatibility break. Or does anyone want to argue for just doing
that and never mind the compatibility issues? This is a pretty weird
corner case already; there can't be very many people relying on it.

To be honest, I'd be inclined not to change the semantics at all. But
that's just me.

I think a sane approach is better than a safe approach.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.

#8 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#2)
Re: Possible marginally-incompatible change to array subscripting

2015-12-22 18:34 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm reviewing Yury Zhuravlev's patch to allow array slice boundaries to be
omitted, for example "a[4:]" means "the slice extending from element 4 to
the last element of a". It strikes me that there's an improvement we
could easily make for the case where a mixture of slice and non-slice
syntax appears, that is something like "a[3:4][5]". Now, this has always
meant a slice, and the way we've traditionally managed that is to treat
simple subscripts as being the range upper bound with a lower bound of 1;
that is, what this example means is exactly "a[3:4][1:5]".

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

This would make no difference of course for the common case where the
array lower bound is 1, but it seems a lot less arbitrary when it isn't.
So I think we should strongly consider changing it to mean that, even
though it would be non-backwards-compatible in such cases.

Comments?

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].

exactly,

Pavel

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#8)
Re: Possible marginally-incompatible change to array subscripting

Pavel Stehule <pavel.stehule@gmail.com> writes:

2015-12-22 18:34 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:

On Tue, Dec 22, 2015 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ISTM that if we'd had Yury's code in there from the beginning, what we
would define this as meaning is "a[3:4][:5]", ie the implied range runs
from whatever the array lower bound is up to the specified subscript.

Gosh, our arrays are strange. I would have expected a[3:4][5] to mean
a[3:4][5:5].

exactly,

Since it's not clear that we've got consensus on doing anything
differently, I've adjusted the current patch to preserve the existing
behavior here (and added some regression tests showing that behavior).
If we do decide to change it, it'd be more appropriate to make that
change in a separate commit, anyway.

regards, tom lane

#10 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#5)
Re: Possible marginally-incompatible change to array subscripting

On 12/22/15 12:01 PM, Tom Lane wrote:

Yury Zhuravlev <u.zhuravlev@postgrespro.ru> writes:

If you are going to break backwards compatibility anyway, could arrays be made
to behave as they do in C/C++/Python/Ruby and other languages?
I'm sorry to bring up this thread again...

I am not sure just exactly how incompatible that would be, but surely it
would break enormously more code than what we're discussing here.
So no, I don't think any such proposal has a chance. There are degrees
of incompatibility, and considering a small/narrow one does not mean that
we'd also consider major breakage.

As I see it, the biggest problem with our arrays is that they can't
decide if they're a simple array (which means >1 dimension is an array
of arrays) or a matrix (all slices in a dimension must be the same
size). They seem to be more like matrices than arrays, but then there's
a bunch of places that completely ignore dimensionality. It would be
nice to standardize them one way or another, but it seems like the
breakage from that would be horrific.
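
The array-of-arrays versus matrix distinction can be sketched in Python, where nested lists are free to be ragged. The helper below is a hypothetical illustration of the "matrix" constraint, not a description of how PostgreSQL stores or checks arrays.

```python
from typing import Optional, Tuple


def shape(value) -> Optional[Tuple[int, ...]]:
    """Return the dimensions of a nested list, or None if it is ragged.

    A scalar has shape (). A list is matrix-like only if all of its
    children have the same (non-None) shape.
    """
    if not isinstance(value, list):
        return ()
    child_shapes = {shape(v) for v in value}
    if len(child_shapes) > 1 or None in child_shapes:
        return None
    inner = child_shapes.pop() if child_shapes else ()
    return (len(value),) + inner


print(shape([[1, 2], [3, 4]]))  # rectangular: valid under "matrix" semantics
print(shape([[1, 2], [3]]))     # ragged: fine as an array of arrays, not as a matrix
```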

One could theoretically construct a custom "type" that followed more
traditional semantics, but then you'd lose all the syntax... which I
suspect would make any such "type" all but unusable. The other problem
would be having it deal with any other data type, but at least there's
ways you can work around that for the most part.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#10)
Re: Possible marginally-incompatible change to array subscripting

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

One could theoretically construct a custom "type" that followed more
traditional semantics, but then you'd lose all the syntax... which I
suspect would make any such "type" all but unusable. The other problem
would be having it deal with any other data type, but at least there's
ways you can work around that for the most part.

Yeah. We've speculated a bit about allowing other datatypes to have
access to the subscript syntax, which could be modeled as allowing
'a[b]' to be an overloadable operator. That seems possibly doable if
someone wanted to put time into it. However, that still leaves a
heck of a lot of functionality on the table, such as automatic creation of
array types corresponding to new scalar types, not to mention the parser's
understanding of "anyarray" vs "anyelement" polymorphism. I have no idea
how we might make those things extensible.

regards, tom lane
