WIP: Range Types

Started by Jeff Davisover 15 years ago40 messageshackers
Jump to latest
#1Jeff Davis
pgsql@j-davis.com

I have been updating my work in progress here:

http://git.postgresql.org/gitweb?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes

Right now, it's not in a reviewable state, but those interested can
glance through the code.

Quick synopsis (for illustration purposes only; don't expect much from
the current code):

CREATE TYPE numrange
AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

Much of the previous discussion seemed to focus on two issues:

1. Reconciling discrete ranges (like ranges of integers) and continuous
ranges (like ranges of numeric).

I liked Robert's suggestion here:

http://archives.postgresql.org/message-id/AANLkTiks_x93_k82b4f_ga634wCi0oeb9fTrUrF28EGM@mail.gmail.com

which says that the user can just define a "canonicalize" function that
will take a range as input (or perhaps the logical pieces of a range)
and put it into an appropriate canonical representation. For instance,
int4range_canonical might take (1,4] and turn it into [2,4]. This is
similar to a few other ideas, but Robert's idea seems to require the
least effort by the person defining the range type, because postgresql
can still handle representation.

It doesn't allow for all of the suggested features. In particular, it
would not allow "granules" to be specified for discrete ranges. But on
balance, it seems like this is the most conceptually simple and I think
it satisfies the primary use cases.

2. Representational issues. There are many possibilities here:
a. flags for inclusivity, start, and offset
b. flags for inclusivity, start, and end
c. if it's a discrete range, start and end only might suffice
d. if it's a discrete range, perhaps something involving "granules"

(a) might be interesting, and for some data types might be more compact,
but it introduces a new datatype that is distinct from the range's
subtype: the "difference type" (that is, for timestamps it's
"interval"). This approach seemed reasonable on paper, but it involves a
lot of extra complexity, and introduces some strange assumptions (using
an offset of "1 month" versus "30 days" can't be allowed).

(c) and (d) are rejected because they require different code paths for
discrete and continuous ranges.

I chose (b). This is the simplest. If desired, we could still allow the
user to specify their own serialize/deserialize functions, which can get
most of the benefits of the other ones anyway.

Other issues:

1. I plan to introduce an ANYRANGE type.

2. We need to use the subtype's IO functions, but those may not be
immutable. So, rather than create new IO functions for each range type,
I was thinking that I'd use just three (anyrange_i_in, anyrange_s_in,
and anyrange_v_in), and select the right one at definition time, based
on the subtype's IO functions' volatility. That seems like a bit of a
hack -- any better ideas?

3. Right now I allow user-defined parse/deparse functions to be
specified. In almost all cases, I would think that we want the text
format to be something like:
[ 2010-01-01, 2011-01-01 )
where the brackets denote inclusivity, and the left and right sides can
be optionally double-quoted. Is it even worth having these parse/deparse
functions, or should we just force the "obvious" format?

4. For the GiST penalty function, and perhaps some picksplit algorithms,
it might be nice to know the length of a range, or do some other kinds
of math. It introduces a lot of complexity to try to define math
functions for each subtype, and try to make sure they behave sanely. So
I was thinking that the user might need to specify a function that
converts the subtype into a float that approximates a value's position
in the total order. Any better ideas?

Overall:

I think this is one of the simpler designs. Conceptually, defining new
ranges of different granularity with ease sounds like a great idea --
but it introduces a lot of complexity (and generated a lot of different
opinions), so it was not included in this design. Similarly, I am
leaning away from lots of user-specified options unless there is a real
use case.

Any suggestions or comments welcome.

Regards,
Jeff Davis

#2Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Jeff Davis (#1)
Re: WIP: Range Types

2011/1/4 Jeff Davis <pgsql@j-davis.com>:

I have been updating my work in progress here:

http://git.postgresql.org/gitweb?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes

Right now, it's not in a reviewable state, but those interested can
glance through the code.

Quick synopsis (for illustration purposes only; don't expect much from
the current code):

 CREATE TYPE numrange
   AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

I am interested in how you define increment/decrement operation of
range value in discrete types. The window functions and PARTITION also
want to represent RANGE but there's no clear solution.

Sorry if it's already been discussed since I didn't track the threads.

Regards

--
Hitoshi Harada

#3Florian Weimer
fweimer@bfk.de
In reply to: Jeff Davis (#1)
Re: WIP: Range Types

* Jeff Davis:

4. For the GiST penalty function, and perhaps some picksplit algorithms,
it might be nice to know the length of a range, or do some other kinds
of math. It introduces a lot of complexity to try to define math
functions for each subtype, and try to make sure they behave sanely. So
I was thinking that the user might need to specify a function that
converts the subtype into a float that approximates a value's position
in the total order.

Doesn't the eqsel hint already provide this information?

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#1)
Re: WIP: Range Types

Jeff Davis <pgsql@j-davis.com> writes:

2. We need to use the subtype's IO functions, but those may not be
immutable. So, rather than create new IO functions for each range type,
I was thinking that I'd use just three (anyrange_i_in, anyrange_s_in,
and anyrange_v_in), and select the right one at definition time, based
on the subtype's IO functions' volatility. That seems like a bit of a
hack -- any better ideas?

You should just do what we do for arrays and records, ie, mark the I/O
functions stable. There is no reason for anyrange to have a more
complicated approach to this than the existing composite-type structures
do. See discussion thread here
http://archives.postgresql.org/pgsql-hackers/2010-07/msg00932.php
and commit here
http://archives.postgresql.org/pgsql-committers/2010-07/msg00307.php

3. Right now I allow user-defined parse/deparse functions to be
specified. In almost all cases, I would think that we want the text
format to be something like:
[ 2010-01-01, 2011-01-01 )
where the brackets denote inclusivity, and the left and right sides can
be optionally double-quoted. Is it even worth having these parse/deparse
functions, or should we just force the "obvious" format?

+1 for forcing a single consistent format.  I compare this to the
Berkeley-era decision to let types specify nondefault array delimiters
--- that was flexibility that didn't help anybody, just resulted in
over-complicated code (or code that would fall over if someone tried
to actually use a delimiter other than comma...)

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#1)
Re: WIP: Range Types

On Tue, Jan 4, 2011 at 2:29 AM, Jeff Davis <pgsql@j-davis.com> wrote:

I liked Robert's suggestion here:

http://archives.postgresql.org/message-id/AANLkTiks_x93_k82b4f_ga634wCi0oeb9fTrUrF28EGM@mail.gmail.com

which says that the user can just define a "canonicalize" function that
will take a range as input (or perhaps the logical pieces of a range)
and put it into an appropriate canonical representation. For instance,
int4range_canonical might take (1,4] and turn it into [2,4]. This is
similar to a few other ideas, but Robert's idea seems to require the
least effort by the person defining the range type, because postgresql
can still handle representation.

It doesn't allow for all of the suggested features. In particular, it
would not allow "granules" to be specified for discrete ranges. But on
balance, it seems like this is the most conceptually simple and I think
it satisfies the primary use cases.

Maybe I'm missing something, but it seems like this approach could
support granules. You just have to define the canonicalize function
in terms of the granule.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#5)
Re: WIP: Range Types

On Tue, 2011-01-04 at 12:21 -0500, Robert Haas wrote:

It doesn't allow for all of the suggested features. In particular, it
would not allow "granules" to be specified for discrete ranges. But on
balance, it seems like this is the most conceptually simple and I think
it satisfies the primary use cases.

Maybe I'm missing something, but it seems like this approach could
support granules. You just have to define the canonicalize function
in terms of the granule.

I meant that it doesn't support them as an explicit, user-visible
concept.

The main drawback here is that only a select group of people will be
defining discrete range types at all, because it would require them to
define a function first. Perhaps that's for the best, because, (as Tom
pointed out) we don't want someone using floats and then specifying a
granule of '0.01'.

While we're talking about it, one question I had is: should the
canonicalize function be:
/* works on the deserialized information right before serialization */
canonical(&flags, &lower_bound, &upper_bound)
or
/* works on the serialized form right after serialization */
range = canonical(range)

I would lean toward the latter because it's simpler on the user (and
allows non-C functions). But perhaps an efficiency argument could be
made for the former because it could avoid one round of
deserialize/reserialize when the representation is not already in
canonical form.

Regards,
Jeff Davis

#7Jeff Davis
pgsql@j-davis.com
In reply to: Hitoshi Harada (#2)
Re: WIP: Range Types

On Tue, 2011-01-04 at 23:04 +0900, Hitoshi Harada wrote:

2011/1/4 Jeff Davis <pgsql@j-davis.com>:

I have been updating my work in progress here:

http://git.postgresql.org/gitweb?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes

Right now, it's not in a reviewable state, but those interested can
glance through the code.

Quick synopsis (for illustration purposes only; don't expect much from
the current code):

CREATE TYPE numrange
AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

I am interested in how you define increment/decrement operation of
range value in discrete types. The window functions and PARTITION also
want to represent RANGE but there's no clear solution.

Sorry if it's already been discussed since I didn't track the threads.

The user would specify a "canonical" function like:

CREATE TYPE int4range AS RANGE (SUBTYPE=int4, SUBTYPE_CMP=btint4cmp,
CANONICAL=my_int4range_canonical);

That function would be called when constructing ranges on input or after
a computation, and could change something like (1,4] into [2,4] if you
prefer the latter form.

So the range types would not have increments, decrements, granules, or
knowledge about the "difference" type (e.g. "interval" is the difference
type for timestamp).

What support do you need/want from range types to help with new window
function features?

Also, partitioning might have some use for range types to represent
range partitions. Comments are welcome.

Regards,
Jeff Davis

#8Jeff Davis
pgsql@j-davis.com
In reply to: Florian Weimer (#3)
Re: WIP: Range Types

On Tue, 2011-01-04 at 14:18 +0000, Florian Weimer wrote:

* Jeff Davis:

4. For the GiST penalty function, and perhaps some picksplit algorithms,
it might be nice to know the length of a range, or do some other kinds
of math. It introduces a lot of complexity to try to define math
functions for each subtype, and try to make sure they behave sanely. So
I was thinking that the user might need to specify a function that
converts the subtype into a float that approximates a value's position
in the total order.

Doesn't the eqsel hint already provide this information?

Can you clarify what you mean? I don't know what the "eqsel hint" is.

Regards,
Jeff Davis

#9Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#6)
Re: WIP: Range Types

On Tue, Jan 4, 2011 at 1:18 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Tue, 2011-01-04 at 12:21 -0500, Robert Haas wrote:

It doesn't allow for all of the suggested features. In particular, it
would not allow "granules" to be specified for discrete ranges. But on
balance, it seems like this is the most conceptually simple and I think
it satisfies the primary use cases.

Maybe I'm missing something, but it seems like this approach could
support granules.  You just have to define the canonicalize function
in terms of the granule.

I meant that it doesn't support them as an explicit, user-visible
concept.

The main drawback here is that only a select group of people will be
defining discrete range types at all, because it would require them to
define a function first. Perhaps that's for the best, because, (as Tom
pointed out) we don't want someone using floats and then specifying a
granule of '0.01'.

While we're talking about it, one question I had is: should the
canonicalize function be:
 /* works on the deserialized information right before serialization */
 canonical(&flags, &lower_bound, &upper_bound)
or
 /* works on the serialized form right after serialization */
 range = canonical(range)

I would lean toward the latter because it's simpler on the user (and
allows non-C functions).

Yeah, me too.

But perhaps an efficiency argument could be
made for the former because it could avoid one round of
deserialize/reserialize when the representation is not already in
canonical form.

I believe this might be an appropriate time to apply Knuth's Law. I'm
not thrilled with the amount of palloc overhead we have in the
backend, but absent some evidence that this case is going to be
particularly significant, I'd be disinclined to contort the interface.
I suspect that if you run oprofile this won't be the bottleneck.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Josh Berkus
josh@agliodbs.com
In reply to: Jeff Davis (#6)
Re: WIP: Range Types

On 1/4/11 10:18 AM, Jeff Davis wrote:

The main drawback here is that only a select group of people will be
defining discrete range types at all, because it would require them to
define a function first. Perhaps that's for the best, because, (as Tom
pointed out) we don't want someone using floats and then specifying a
granule of '0.01'.

Frankly, I'm still not convinced that *anyone* will really need discrete
range types -- as opposed to continuous range types, which I'm already
using in production ala "temporal". So I'm completely OK with making
discrete range types hard to use, as long as continous range types are
easy to use.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#11Jeff Davis
pgsql@j-davis.com
In reply to: Josh Berkus (#10)
Re: WIP: Range Types

On Tue, 2011-01-04 at 16:45 -0800, Josh Berkus wrote:

On 1/4/11 10:18 AM, Jeff Davis wrote:

The main drawback here is that only a select group of people will be
defining discrete range types at all, because it would require them to
define a function first. Perhaps that's for the best, because, (as Tom
pointed out) we don't want someone using floats and then specifying a
granule of '0.01'.

Frankly, I'm still not convinced that *anyone* will really need discrete
range types

Well, *need* is a standard that can never be met. But with something
like a date range, it's very possible that a discrete version matches
the real-world problem more closely than a continuous one.

If you use only continuous ranges, then be careful to stick with exactly
one convention, or you will likely get wrong results (I think this point
has already been established). That sounds easy, but consider:
* If you want to know whether two ranges are adjacent (a common
requirement), then you need to use "[ )" or "( ]".
* If you need to map a single point into a range, the only thing that
makes sense is "[ ]".
* If your query contains current_date, you'll probably want ranges that
are either in "( ]" or "[ ]" form.
* If you are mixing data sets, they may use different conventions.

You can work around all of these problems by making the query more
complex (and more error-prone). But I wouldn't like to give up on
discrete ranges for types where it really makes sense (dates, IPs,
integers).

Regards,
Jeff Davis

#12Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#11)
Re: WIP: Range Types

On Wed, Jan 5, 2011 at 12:54 AM, Jeff Davis <pgsql@j-davis.com> wrote:

On Tue, 2011-01-04 at 16:45 -0800, Josh Berkus wrote:

On 1/4/11 10:18 AM, Jeff Davis wrote:

The main drawback here is that only a select group of people will be
defining discrete range types at all, because it would require them to
define a function first. Perhaps that's for the best, because, (as Tom
pointed out) we don't want someone using floats and then specifying a
granule of '0.01'.

Frankly, I'm still not convinced that *anyone* will really need discrete
range types

Well, *need* is a standard that can never be met. But with something
like a date range, it's very possible that a discrete version matches
the real-world problem more closely than a continuous one.

If you use only continuous ranges, then be careful to stick with exactly
one convention, or you will likely get wrong results (I think this point
has already been established). That sounds easy, but consider:
 * If you want to know whether two ranges are adjacent (a common
requirement), then you need to use "[ )" or "( ]".
 * If you need to map a single point into a range, the only thing that
makes sense is "[ ]".
 * If your query contains current_date, you'll probably want ranges that
are either in "( ]" or "[ ]" form.
 * If you are mixing data sets, they may use different conventions.

You can work around all of these problems by making the query more
complex (and more error-prone). But I wouldn't like to give up on
discrete ranges for types where it really makes sense (dates, IPs,
integers).

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Jeff Davis (#7)
Re: WIP: Range Types

2011/1/5 Jeff Davis <pgsql@j-davis.com>:

On Tue, 2011-01-04 at 23:04 +0900, Hitoshi Harada wrote:

 CREATE TYPE numrange
   AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

I am interested in how you define increment/decrement operation of
range value in discrete types. The window functions and PARTITION also
want to represent RANGE but there's no clear solution.

The user would specify a "canonical" function like:

  CREATE TYPE int4range AS RANGE (SUBTYPE=int4, SUBTYPE_CMP=btint4cmp,
    CANONICAL=my_int4range_canonical);

That function would be called when constructing ranges on input or after
a computation, and could change something like (1,4] into [2,4] if you
prefer the latter form.

So the range types would not have increments, decrements, granules, or
knowledge about the "difference" type (e.g. "interval" is the difference
type for timestamp).

To canonicalize, it might be ok. I wonder if you won't operate on the
range types like extending their bounds or iterate/enum values from
start to end. In such situation, I bet you'll need to know how to walk
values step by step.

What support do you need/want from range types to help with new window
function features?

My argument is here:
http://archives.postgresql.org/message-id/AANLkTimFmQmbzJ5CTXvE_PwT_zmCuHPoet3gaQq6Pvo8@mail.gmail.com

For any type to calculate boundary based on RANGE clause in window
functions, we need some type interface mechanism in the core to know
how to add / subtract values to reach the boundary from the current
value. For example,

SELECT count(*) OVER (ORDER BY n_int RANGE BETWEEN 10 PRECEDING AND 5
FOLLOWING) FROM tbl;

In the standard, the types allowed in RANGE are only int, float, date,
timestamp, etc. but we have various extensible data types as you know
and we couldn't assume '+' / '-' operators tell add /subtract
operation absolutely.

Also, partitioning might have some use for range types to represent
range partitions. Comments are welcome.

I heard about partitioning which may have the same problem in RANGE
clause from Itagaki-san, but have not looked so much yet.

Regards,

--
Hitoshi Harada

#14Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Jeff Davis (#7)
Re: WIP: Range Types

2011/1/5 Jeff Davis <pgsql@j-davis.com>:

On Tue, 2011-01-04 at 23:04 +0900, Hitoshi Harada wrote:

 CREATE TYPE numrange
   AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

I am interested in how you define increment/decrement operation of
range value in discrete types. The window functions and PARTITION also
want to represent RANGE but there's no clear solution.

The user would specify a "canonical" function like:

  CREATE TYPE int4range AS RANGE (SUBTYPE=int4, SUBTYPE_CMP=btint4cmp,
    CANONICAL=my_int4range_canonical);

That function would be called when constructing ranges on input or after
a computation, and could change something like (1,4] into [2,4] if you
prefer the latter form.

So the range types would not have increments, decrements, granules, or
knowledge about the "difference" type (e.g. "interval" is the difference
type for timestamp).

To canonicalize, it might be ok. I wonder if you won't operate on the
range types like extending their bounds or iterate/enum values from
start to end. In such situation, I bet you'll need to know how to walk
values step by step.

What support do you need/want from range types to help with new window
function features?

My argument is here:
http://archives.postgresql.org/message-id/AANLkTimFmQmbzJ5CTXvE_PwT_zmCuHPoet3gaQq6Pvo8@mail.gmail.com

For any type to calculate boundary based on RANGE clause in window
functions, we need some type interface mechanism in the core to know
how to add / subtract values to reach the boundary from the current
value. For example,

SELECT count(*) OVER (ORDER BY n_int RANGE BETWEEN 10 PRECEDING AND 5
FOLLOWING) FROM tbl;

In the standard, the types allowed in RANGE are only int, float, date,
timestamp, etc. but we have various extensible data types as you know
and we couldn't assume '+' / '-' operators tell add /subtract
operation absolutely.

Also, partitioning might have some use for range types to represent
range partitions. Comments are welcome.

I heard about partitioning which may have the same problem in RANGE
clause from Itagaki-san, but have not looked so much yet.

Regards,

--
Hitoshi Harada

#15David Fetter
david@fetter.org
In reply to: Hitoshi Harada (#14)
Re: WIP: Range Types

On Thu, Jan 06, 2011 at 02:25:01AM +0900, Hitoshi Harada wrote:

2011/1/5 Jeff Davis <pgsql@j-davis.com>:

On Tue, 2011-01-04 at 23:04 +0900, Hitoshi Harada wrote:

�CREATE TYPE numrange
� �AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);

I am interested in how you define increment/decrement operation
of range value in discrete types. The window functions and
PARTITION also want to represent RANGE but there's no clear
solution.

The user would specify a "canonical" function like:

� CREATE TYPE int4range AS RANGE (SUBTYPE=int4, SUBTYPE_CMP=btint4cmp,
� � CANONICAL=my_int4range_canonical);

That function would be called when constructing ranges on input or after
a computation, and could change something like (1,4] into [2,4] if you
prefer the latter form.

So the range types would not have increments, decrements, granules, or
knowledge about the "difference" type (e.g. "interval" is the difference
type for timestamp).

To canonicalize, it might be ok. I wonder if you won't operate on
the range types like extending their bounds or iterate/enum values
from start to end. In such situation, I bet you'll need to know how
to walk values step by step.

What support do you need/want from range types to help with new window
function features?

My argument is here:
http://archives.postgresql.org/message-id/AANLkTimFmQmbzJ5CTXvE_PwT_zmCuHPoet3gaQq6Pvo8@mail.gmail.com

For any type to calculate boundary based on RANGE clause in window
functions, we need some type interface mechanism in the core to know
how to add / subtract values to reach the boundary from the current
value. For example,

SELECT count(*) OVER (ORDER BY n_int RANGE BETWEEN 10 PRECEDING AND 5
FOLLOWING) FROM tbl;

I'm not sure I get the connection between this type of range and the
"range types" Jeff is working on. Jeff's work involves a way to
create types which represent ranges over types which have some kind of
ordering, although not necessarily a successor operation.

Had you planned to cast to an integer range in the process of doing
this window?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#16Jeff Davis
pgsql@j-davis.com
In reply to: David Fetter (#15)
Re: WIP: Range Types

On Wed, 2011-01-05 at 10:41 -0800, David Fetter wrote:

On Thu, Jan 06, 2011 at 02:25:01AM +0900, Hitoshi Harada wrote:

For any type to calculate boundary based on RANGE clause in window
functions, we need some type interface mechanism in the core to know
how to add / subtract values to reach the boundary from the current
value. For example,

SELECT count(*) OVER (ORDER BY n_int RANGE BETWEEN 10 PRECEDING AND 5
FOLLOWING) FROM tbl;

I'm not sure I get the connection between this type of range and the
"range types" Jeff is working on. Jeff's work involves a way to
create types which represent ranges over types which have some kind of
ordering, although not necessarily a successor operation.

Had you planned to cast to an integer range in the process of doing
this window?

I don't think Harada-san needs the type infrastructure itself, just the
interface to find the "difference type" (e.g. "interval" from
"timestamp") as well as functions like add and subtract (potentially two
interesting subtract functions). Without knowing which function to call,
there is no way to find the window boundaries given the current row.

The current design for range types doesn't ask for add or subtract.
Although it might be interesting to try to use such an interface for
range types, it introduces a lot of complexity and makes it easier to
cause subtle problems (consider that addition of timestamps and
intervals is not commutative).

Even if add and subtract were associated with a range type, there's no
way to tell which range type to pick given the window function syntax
(multiple range types could be defined over the same subtype).

I think the interface question should be addressed more directly with a
"type interfaces" patch.

Regards,
Jeff Davis

#17Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Jeff Davis (#16)
Re: WIP: Range Types

2011/1/6 Jeff Davis <pgsql@j-davis.com>:

Even if add and subtract were associated with a range type, there's no
way to tell which range type to pick given the window function syntax
(multiple range types could be defined over the same subtype).

I think the interface question should be addressed more directly with a
"type interfaces" patch.

I agree the current canonical approach fits range type's demand, and
I'm inclined that the type interface discussion is another point.
IIRC, Robert Haas originally began to propose the idea of type
interface to get together three of KNN-GIST, range type and window
frame issue. For KNN-GIST, it was committed by extending pg_amop
without considering others and range type will be as well. Not getting
them together might be the answer.

Regards,

--
Hitoshi Harada

#18Jeff Davis
pgsql@j-davis.com
In reply to: Hitoshi Harada (#17)
Re: WIP: Range Types

On Thu, 2011-01-06 at 09:30 +0900, Hitoshi Harada wrote:

Robert Haas originally began to propose the idea of type
interface to get together three of KNN-GIST, range type and window
frame issue. For KNN-GIST, it was committed by extending pg_amop
without considering others and range type will be as well. Not getting
them together might be the answer.

We may end up combining all of these concepts into type interfaces
later. Now that we have multiple potential users of type interfaces, it
will be easier to design type interfaces to work well for all of them.

Regards,
Jeff Davis

#19Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#16)
Re: WIP: Range Types

On Wed, 2011-01-05 at 12:07 -0800, Jeff Davis wrote:

The current design for range types doesn't ask for add or subtract.
Although it might be interesting to try to use such an interface for
range types, it introduces a lot of complexity and makes it easier to
cause subtle problems (consider that addition of timestamps and
intervals is not commutative).

A consequence of this design is that some generic range functions, like
"length" or "distance" would need to rely on the polymorphism of "+" and
"-" to work.

I'm also not sure if a constructor like "range(start, offset) returns
anyrange" could be made to work generically at all, because the start
and offset may be two different types (and a function that takes
ANYELEMENT requires that all ANYELEMENT arguments are the same type).

Does anyone see a problem with that?

Regards,
Jeff Davis

#20Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#19)
Re: WIP: Range Types

On Thu, Jan 6, 2011 at 12:32 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2011-01-05 at 12:07 -0800, Jeff Davis wrote:

The current design for range types doesn't ask for add or subtract.
Although it might be interesting to try to use such an interface for
range types, it introduces a lot of complexity and makes it easier to
cause subtle problems (consider that addition of timestamps and
intervals is not commutative).

A consequence of this design is that some generic range functions, like
"length" or "distance" would need to rely on the polymorphism of "+" and
"-" to work.

I'm also not sure if a constructor like "range(start, offset) returns
anyrange" could be made to work generically at all, because the start
and offset may be two different types (and a function that takes
ANYELEMENT requires that all ANYELEMENT arguments are the same type).

Does anyone see a problem with that?

Seems like you could make people who want that write range(start,
start+offset) instead without too much pain.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21Florian Weimer
fweimer@bfk.de
In reply to: Jeff Davis (#8)
#22Jeff Davis
pgsql@j-davis.com
In reply to: Florian Weimer (#21)
#23Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#1)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#23)
#25Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#24)
#26Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#25)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#25)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#26)
#29Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#27)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#29)
#31Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#30)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#31)
#33Martijn van Oosterhout
kleptog@svana.org
In reply to: Robert Haas (#32)
#34Robert Haas
robertmhaas@gmail.com
In reply to: Martijn van Oosterhout (#33)
#35David E. Wheeler
david@kineticode.com
In reply to: Robert Haas (#34)
#36Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#34)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#37)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#38)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#39)