Timestamp/Interval proposals: Part 2

Started by Josh Berkusover 23 years ago24 messages
#1Josh Berkus
josh@agliodbs.com

Developers,

Here's part to of my proposal to enhance, improve, and fix Timestamp and
Interval in PostgreSQL. Part I is included after Part II in case everyone
has forgotten it.

Please give me feedback on this. My interest is that I develop calendaring
apps based on Postgresql, and the current Timestamp + Interval limitations
and wierdnesses are giving me tsuris. Thus I'm not particularly attached
to the specifics of my proposals, so long as we do *something* to fix the
issues.

Part II

Interval
-----------------------------------

There are a few problems currently with the Interval data type. The biggest
is that the current rules give us no clear path for implementation of a full
set of operators. The SQL92 standard is no help here; its implementation is
unintuitive and extremely limited ... more limited, in fact, than the current
incomplete implementation in PostgreSQL.

Proposal #3: We should support the addition of "whole days".

Description: Interval should support a "Weeks to Days" increment which is
atomic per day, and not as a aggregate of hours.

Reason: Currently, the "days" increment in Interval is treated as "x 24 hours"
and not as whole days. This can cause some confusion when date calculations
break over a DST change; users do *not* expect events to get an hour earlier
or later in the fall or the spring. The current result is that a lot of
users give up on utilizing time zones because they can't deal with the time
shift in calendar applications.

Proposal #4: Create to_char(INTERVAL, 'format string') Function.

Description: We could really use a built-in function that supports output
formatting of Intervals.

Reason: self-evident, I think.

Proposal #5: Two alternate proposals for overhaul of the interval data type.

Description: Interval needs some radical changes to calculations and
operators.

Reason: Currently, it is nearly impossible to conceive of an implementation
for a full set of operators for the interval data type ( + - / * ) because of
the variability of conversions from one interval increment to another. For
example, what exactly should be the result of '3 months' / '4 days'? Here
are two alternatives.

Alternative #1: Treat Interval Increments as Atomic, and Round

If we implemented this, each of the 3 sub-types of Interval (Year to Month,
Week to Day, and Hour to Millesecond per proposal #3) would be treated as
"atomic" and not renderable in terms of smaller increments, in the same way
that integers are not divisible beyond a prime. In fact, rather than
expressing remainders in smaller increments, the modulo ( % ) operator would
be used to express the remainder.

Further, we would need to create a set of casting functions that allows for
the conversion of one interval subtype into another, using rounding by
approximates, such as 1 year = 365 days, 1 month = 30 days, 1 day = 24 hours,
etc. This is not that different from how the DATE data type works. If users
attempt multiplication and division with intervals of different subtypes, an
implicit cast would be made into the subtype of the smallest value.

Finally, multiplication and division by floats would be disallowed and
replaced by multiplication and division by integers. Thus:

'1 month' + '33 days' = '1 month 33 days'
'1 month 33 days'::INTERVAL WEEK TO DAY = '63 days'
'1 month' + '33 days'::INTERVAL YEAR TO MONTH = '2 months'
'5 months' / '2 months' = 2
'5 months' % '2 months' = '1 month'
'5 months' / 2 = '2 months'
'5 months' % 2 = '1 month'
'9 months' / '2 weeks' = '270 days' / '14 days' = 19
'15 hours' * 20 = '300 hours' (not '12 days 12 hours')
etc.

Pros: It's simple and relatively intuitive. This approach also is similar
to the SQL92 spec, which focuses on interval subtypes.
Cons: It requires an annoying implementation of subtypes, which is cumbersome
and difficult to manage when you have mixed intervals (e.g. '4 days 8 hours 9
minutes'). And, with every operation, rounding is being used which can
result in some ghastly inequalities:
'1 year'/12 --> '1 month'::INTERVAL WEEK TO DAY --> '30 days' * 12
--> '360 days' / '1 year' = 0

Alternative #2: Tie Intervals to a Specific Timestamp

This is the most robust interval implementation I can imagine. The basic idea
is this: instead of intervals being an "absolute" value, they would be
rooted in a specific timestamp. For example, rather than:
INTERVAL '45 Days'
We would use:
INTERVAL '2002-03-30 +45 days'
This would allow us to ground our intervals in the real calendar, and any
subtype conversion problems could be eliminated by resorting to the calendar.
We would know, for example, that:
'2002-05-30 +2 months' / '2002-05-30 +2 weeks' = 4.35714...
and even that
'2002-05-30 +2 months' / 14 = '2002-05-30 +4 days 8 hours 34 min 17 sec ...'

For simplicity, users would be allowed to use intervals which did not state a
start date. In this case, the start date would be assumed to be a default
start date, such as '2000-01-01 00:00:00'. Also, start dates could be
assumed from timestamp math:

'2002-07-30' - '2002-05-30' = '2002-07-30 -61 days'

Of course, this does not get us entirely away from subtyping. For example, if
we did arithmatic with disparate dates, increments would have to be applied
per subtype. That is:

'2002-05-30 +61 days' = '2002-05-30 +2 months'
but '2002-05-30 +2 months' + '2002-01-28'
= '2002-01-28 +2 months' < '2002-01-28 +61 days'

Also, interval to interval math would no longer be commutative, becuase we
would need to use the start date of the first interval in the case of
disparate start dates:

'2002-05-30 + 61 days' + '2002-01-28 +59 days'
= '2002-05-30 +120 days' < '2002-05-30 + 4 months'
even though '2002-05-30 + 61 days' = '2002-05-30 + 2 months'
and '2002-01-28 +59 days' = '2002-01-28 +2 months'

Pros: The most accurate interval calculations possible.
Cons: How the heck would we implement it? And *explain* it? And it's pretty
darn far from the SQL92 implementation.

---------------------------------------------
And, a re-hash of Part I:

PROPOSAL FOR ADJUSTMENTS OF POSTGRESQL TIMESTAMP AND INTERVAL HANDLING
Draft 0.2

Timestamp
------------------------------
Proposal #1: TIMESTAMP WITHOUT TIME ZONE as default

Description: Currently, the data type invoked when users select TIMESTAMP is
TIMESTAMP WITH TIME ZONE. We should change this so that TIMESTAMP defaults to
TIMESTAMP WITHOUT TIME ZONE unless WITH TIME ZONE is specificied.

Reason: Handling time zones is tricky and non-intuitive for the beginning
user. TIMESTAMP WITH TIME ZONE should be reserved for DBAs who know what
they're doing.

Resolution: Taken care of in 7.3.

Proposal #2: We need more time zones.

Description: We need to add, or be able to add, many new time zones to
Postgresql. Ideal would be some kind of "create time zone" statement.

Reason: Current included time zones do not cover all real-world time zones,
and the situation is likely to get worse as various governments play with
their calendars. For example, there is no current time zone which would be
appropriate for the state of Arizona, i.e. "Central Standard Time without
Daylight Savings Time".

Further: A CREATE TIME ZONE statement would have the following syntax:
CREATE TIME ZONE GMT_adjustment, abbreviation, uses_DST, DST_starts
(optional),
DST_ends (optional)
This would allow, to some degree, DBA creation of time zones to take into
account local laws and wierdnesses.

Alternative: We can allow users to designate timezones according to GMT
offset and whether or not they support DST. Example "-8:00 DST" for PST/PDT,
and "-7:00 NDS" for the Arizona example above.

--
-Josh Berkus
Techdocs Writer

#2Thomas Lockhart
lockhart@fourpalms.org
In reply to: Josh Berkus (#1)
Re: Timestamp/Interval proposals: Part 2

Please give me feedback on this...
There are a few problems currently with the Interval data type. The biggest
is that the current rules give us no clear path for implementation of a full
set of operators. The SQL92 standard is no help here; its implementation is
unintuitive and extremely limited ... more limited, in fact, than the current
incomplete implementation in PostgreSQL.

Please define "a full set of operators". Or do the subsequent proposals
defining new behaviors and some operations constitute that list?

Proposal #3: We should support the addition of "whole days".
Description: Interval should support a "Weeks to Days" increment which is
atomic per day, and not as a aggregate of hours.
Reason: Currently, the "days" increment in Interval is treated as "x 24 hours"
and not as whole days. This can cause some confusion when date calculations
break over a DST change; users do *not* expect events to get an hour earlier
or later in the fall or the spring. The current result is that a lot of
users give up on utilizing time zones because they can't deal with the time
shift in calendar applications.

You are overstating the problem imho, but there is a problem for some
users. SQL9x avoids the issue by defining *only* constant offsets for
time zones. That doesn't work in the real world :/

We would expand the storage size by at least 4 bytes to accomodate the
"qualitative day" information. Currently takes 12 bytes, and will take
16 or more. We will need to check for overflows during date/time math,
we will need some heuristics for conversions between hours and days
during calculations, and some users will need to cope with the changed
behavior. Operations like math and comparisons will be more expensive
(though may not be a hugely noticable effect).

Proposal #4: Create to_char(INTERVAL, 'format string') Function.
Reason: self-evident, I think.

Oh. Didn't know it wasn't already there.

Proposal #5: Two alternate proposals for overhaul of the interval data type.
Description: Interval needs some radical changes to calculations and
operators.
Reason: Currently, it is nearly impossible to conceive of an implementation
for a full set of operators for the interval data type ( + - / * ) because of
the variability of conversions from one interval increment to another. For
example, what exactly should be the result of '3 months' / '4 days'? Here
are two alternatives.
Alternative #1: Treat Interval Increments as Atomic, and Round

Yuck (imho of course ;)

If we implemented this, each of the 3 sub-types of Interval (Year to Month,
Week to Day, and Hour to Millesecond per proposal #3) would be treated as
"atomic" and not renderable in terms of smaller increments, in the same way
that integers are not divisible beyond a prime. In fact, rather than
expressing remainders in smaller increments, the modulo ( % ) operator would
be used to express the remainder.

Further, we would need to create a set of casting functions that allows for
the conversion of one interval subtype into another, using rounding by
approximates, such as 1 year = 365 days, 1 month = 30 days, 1 day = 24 hours,
etc. This is not that different from how the DATE data type works. If users
attempt multiplication and division with intervals of different subtypes, an
implicit cast would be made into the subtype of the smallest value.

Finally, multiplication and division by floats would be disallowed and
replaced by multiplication and division by integers. Thus:

Overly restrictive I think. There *is* a use for maintaining precision
during math operations, though apparently not for your use cases.

'1 month' + '33 days' = '1 month 33 days'
'1 month 33 days'::INTERVAL WEEK TO DAY = '63 days'
'1 month' + '33 days'::INTERVAL YEAR TO MONTH = '2 months'
'5 months' / '2 months' = 2
'5 months' % '2 months' = '1 month'
'5 months' / 2 = '2 months'
'5 months' % 2 = '1 month'
'9 months' / '2 weeks' = '270 days' / '14 days' = 19
'15 hours' * 20 = '300 hours' (not '12 days 12 hours')
etc.

Pros: It's simple and relatively intuitive. This approach also is similar
to the SQL92 spec, which focuses on interval subtypes.
Cons: It requires an annoying implementation of subtypes, which is cumbersome
and difficult to manage when you have mixed intervals (e.g. '4 days 8 hours 9
minutes'). And, with every operation, rounding is being used which can
result in some ghastly inequalities:
'1 year'/12 --> '1 month'::INTERVAL WEEK TO DAY --> '30 days' * 12
--> '360 days' / '1 year' = 0

Alternative #2: Tie Intervals to a Specific Timestamp

Double yuck. You already have this capability by your choice of schema;
intervals are intervals and timestamps are timestamps. The behaviors you
discuss above (both current and possible) handle this.

---------------------------------------------
And, a re-hash of Part I:

PROPOSAL FOR ADJUSTMENTS OF POSTGRESQL TIMESTAMP AND INTERVAL HANDLING
Draft 0.2
Proposal #2: We need more time zones.
Description: We need to add, or be able to add, many new time zones to
Postgresql. Ideal would be some kind of "create time zone" statement.
Reason: Current included time zones do not cover all real-world time zones,
and the situation is likely to get worse as various governments play with
their calendars. For example, there is no current time zone which would be
appropriate for the state of Arizona, i.e. "Central Standard Time without
Daylight Savings Time".

Bad example, and I'm not following your argument here. PostgreSQL
supports *many* time zones (Peter E. has said "too many") and any change
for the Arizona example will be at odds with how dates and times are
expected to be handled in, uh, Arizona. They use Mountain Standard Time
(MST), except for years when they didn't, and are covered by specifying
"MST" on input and "SET TIME ZONE 'America/Phoenix'" (and perhaps others
too; it seems that "MST6" gives me consistant behavior on my Linux box).

Further: A CREATE TIME ZONE statement would have the following syntax:
CREATE TIME ZONE GMT_adjustment, abbreviation, uses_DST, DST_starts
(optional),
DST_ends (optional)
This would allow, to some degree, DBA creation of time zones to take into
account local laws and wierdnesses.
Alternative: We can allow users to designate timezones according to GMT
offset and whether or not they support DST. Example "-8:00 DST" for PST/PDT,
and "-7:00 NDS" for the Arizona example above.

I can't imagine that you are not finding a workable solution with the
current capabilities. That said, we are considering adopting the
historic zinc package to support time zones within PostgreSQL (sounds
like you might be doing some of the development ;). And for time zone
lookup (not supported in the zinc API) it *would* be nice to move to a
DBMS table-based implementation, rather than the hardcoded tables we
have now. They may have been good enough for the last 12 years, but
certainly lookup stuff seems like it should be in a database table, eh?

- Thomas

#3Josh Berkus
josh@agliodbs.com
In reply to: Thomas Lockhart (#2)
Re: Timestamp/Interval proposals: Part 2

Thomas,

Please define "a full set of operators". Or do the subsequent
proposals
defining new behaviors and some operations constitute that list?

+ - / * < > = and, if appropriate, %
Where support is lacking is * and /

Don't get me wrong. PostgreSQL has the best implementation of
date/time/interval handling in any database I use. It's just that
there are a few limitations and wierdnesses left, and I'd really like
to see them ironed out so that we can call our impelmentation "near
perfect". Also, so I can stop coding workarounds into my database
apps.

You are overstating the problem imho, but there is a problem for some
users. SQL9x avoids the issue by defining *only* constant offsets for
time zones. That doesn't work in the real world :/

We would expand the storage size by at least 4 bytes to accomodate
the
"qualitative day" information. Currently takes 12 bytes, and will
take
16 or more. We will need to check for overflows during date/time
math,
we will need some heuristics for conversions between hours and days
during calculations, and some users will need to cope with the
changed
behavior. Operations like math and comparisons will be more expensive
(though may not be a hugely noticable effect).

I can see why you've put off doing it. At a basic level, though,
current behaviour is counter-intuitive, so we'll need to do it someday.

Oh. Didn't know it wasn't already there.

Not in 7.2.1. And if you don't know about it, probably not in 7.3
either.

Alternative #1: Treat Interval Increments as Atomic, and Round

Yuck (imho of course ;)

Hey, I did ask for an opinion. <grin>

Alternative #2: Tie Intervals to a Specific Timestamp

Double yuck. You already have this capability by your choice of
schema;
intervals are intervals and timestamps are timestamps. The behaviors
you
discuss above (both current and possible) handle this.

Hmmm? How much is '1 month' / '4 days' then?

The current implementation does not support the / and * operators; that
is, they are supported for some type combos, but not for others, and
the results are inconsistent and sometimes confusing.

Bad example, and I'm not following your argument here. PostgreSQL
supports *many* time zones (Peter E. has said "too many") and any
change
for the Arizona example will be at odds with how dates and times are
expected to be handled in, uh, Arizona. They use Mountain Standard
Time
(MST), except for years when they didn't, and are covered by
specifying
"MST" on input and "SET TIME ZONE 'America/Phoenix'" (and perhaps
others
too; it seems that "MST6" gives me consistant behavior on my Linux
box).

Actually, the real problems I have encountered with time zones would be
solved mostly by adding the 'WEEKS TO DAYS' subtype above. Currently
I'm forced to use TIMESTAMP WITHOUT TIMEZONE in order to avoid the
wierd one-hour shifts in my calendaring app.

I can't imagine that you are not finding a workable solution with the
current capabilities. That said, we are considering adopting the
historic zinc package to support time zones within PostgreSQL (sounds
like you might be doing some of the development ;). And for time zone
lookup (not supported in the zinc API) it *would* be nice to move to
a
DBMS table-based implementation, rather than the hardcoded tables we
have now. They may have been good enough for the last 12 years, but
certainly lookup stuff seems like it should be in a database table,
eh?

Yeah. I'd love to have somebody explain this to me. I noticed when
zinc was mentioned, but I don't know *what* it is. Care to send me a
link?

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#3)
Re: Timestamp/Interval proposals: Part 2

"Josh Berkus" <josh@agliodbs.com> writes:

Yeah. I'd love to have somebody explain this to me. I noticed when
zinc was mentioned, but I don't know *what* it is. Care to send me a
link?

I think http://www.twinsun.com/tz/tz-link.htm is the underlying timezone
database that Thomas is referring to. I can't find anything named zinc
that seems relevant.

I'm not as excited about sticking the info into Postgres tables as
Thomas seems to be. I think that's (a) unnecessary and (b) likely to
create severe startup problems, since the postmaster needs access to
timezone info to interpret the TZ environment variable, but it can't
read the database. It seems to me that a precalculated timezone table
is plenty good enough.

regards, tom lane

#5Karel Zak
zakkr@zf.jcu.cz
In reply to: Thomas Lockhart (#2)
Re: Timestamp/Interval proposals: Part 2

On Fri, Jun 07, 2002 at 06:48:31PM -0700, Thomas Lockhart wrote:

Proposal #4: Create to_char(INTERVAL, 'format string') Function.
Reason: self-evident, I think.

Oh. Didn't know it wasn't already there.

I'm _sure_ that to_char() is there for interval.

testt=# select to_char('33s 15h 10m 5month'::interval, 'HH:MI:SS Month');
to_char
--------------------
03:10:33 May
(1 row)

test=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.2 on i686-pc-linux-gnu, compiled by GCC 2.95.4
(1 row)

And it's in the docs too....

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#6Karel Zak
zakkr@zf.jcu.cz
In reply to: Josh Berkus (#1)
Re: Timestamp/Interval proposals: Part 2

I'm _sure_ that to_char() is there for interval.

testt=# select to_char('33s 15h 10m 5month'::interval, 'HH:MI:SS Month');
to_char
--------------------
03:10:33 May
(1 row)

Does "May" make sense for an _interval _ ? (Feb 22 + May = Jul 22)?

Would not "5 months" make more sense ?

to_char() convert interval to 'tm' and make output like this struct,
I don't know what other is possible do with it.

Or is it some ISO standard ?

Ditto for 15h -> 03 .

HH vs. HH24

test=# select to_char('33s 15h 10m 5months'::interval, 'HH24:MI:SS Month');
to_char
--------------------
15:10:33 May

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#7Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#5)
Re: Timestamp/Interval proposals: Part 2

On Mon, 2002-06-10 at 09:58, Karel Zak wrote:

On Fri, Jun 07, 2002 at 06:48:31PM -0700, Thomas Lockhart wrote:

Proposal #4: Create to_char(INTERVAL, 'format string') Function.
Reason: self-evident, I think.

Oh. Didn't know it wasn't already there.

I'm _sure_ that to_char() is there for interval.

testt=# select to_char('33s 15h 10m 5month'::interval, 'HH:MI:SS Month');
to_char
--------------------
03:10:33 May
(1 row)

Does "May" make sense for an _interval _ ? (Feb 22 + May = Jul 22)?

Would not "5 months" make more sense ?

Or is it some ISO standard ?

Ditto for 15h -> 03 .

--------------------
Hannu

#8Karel Zak
zakkr@zf.jcu.cz
In reply to: Josh Berkus (#1)
Re: Timestamp/Interval proposals: Part 2

On Mon, Jun 10, 2002 at 04:26:47PM +0200, Hannu Krosing wrote:

to_char() convert interval to 'tm' and make output like this struct,

My point is that to_char-ing intervals by converting them to dates is
non-intuitive.

It is really confusing to say that an interval of 5 months = "May"
and 15months == "1 March" ;(

I don't know what other is possible do with it.

perhaps show them with the precision specified and keep data for bigger
units in biggest specified unit.

to_char('2years 1min 4sec'::interval, 'MM SS'); ==> '24mon 64sec'
to_char('2years 1min 4sec'::interval, 'MM MI SS'); ==> '24mon 1min 4sec'

Hmmm, but it's really out of to_char(). For example 'MM' is defined
as number in range 1..12.

The to_char() convert date/time data to string and not to better formatted
interval. The right name for your request is to_interval().

TODO?

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#9Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#6)
Re: Timestamp/Interval proposals: Part 2

On Mon, 2002-06-10 at 10:49, Karel Zak wrote:

I'm _sure_ that to_char() is there for interval.

testt=# select to_char('33s 15h 10m 5month'::interval, 'HH:MI:SS Month');
to_char
--------------------
03:10:33 May
(1 row)

Does "May" make sense for an _interval _ ? (Feb 22 + May = Jul 22)?

Would not "5 months" make more sense ?

to_char() convert interval to 'tm' and make output like this struct,

My point is that to_char-ing intervals by converting them to dates is
non-intuitive.

It is really confusing to say that an interval of 5 months = "May"
and 15months == "1 March" ;(

I don't know what other is possible do with it.

perhaps show them with the precision specified and keep data for bigger
units in biggest specified unit.

to_char('2years 1min 4sec'::interval, 'MM SS'); ==> '24mon 64sec'
to_char('2years 1min 4sec'::interval, 'MM MI SS'); ==> '24mon 1min 4sec'

Or is it some ISO standard ?

Does anyone know what standard says about interval formats?

------------
annu

#10Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#8)
Re: Timestamp/Interval proposals: Part 2

On Mon, 2002-06-10 at 15:43, Karel Zak wrote:

On Mon, Jun 10, 2002 at 04:26:47PM +0200, Hannu Krosing wrote:

to_char() convert interval to 'tm' and make output like this struct,

My point is that to_char-ing intervals by converting them to dates is
non-intuitive.

It is really confusing to say that an interval of 5 months = "May"
and 15months == "1 March" ;(

I don't know what other is possible do with it.

perhaps show them with the precision specified and keep data for bigger
units in biggest specified unit.

to_char('2years 1min 4sec'::interval, 'MM SS'); ==> '24mon 64sec'
to_char('2years 1min 4sec'::interval, 'MM MI SS'); ==> '24mon 1min 4sec'

Hmmm, but it's really out of to_char(). For example 'MM' is defined
as number in range 1..12.

The to_char() convert date/time data to string and not to better formatted
interval. The right name for your request is to_interval().

if there were a to_interval() then it should convert char data to
interval, like to_date(), to_number() and to_timestamp() do

actually we currently have to_char(x,t) functions for formatting the
following input types, where the second arg is always the format - and
they do take different format strings for different types (i.e. we dont
convert int or double to timestamp and then format that)

to_char | bigint, text
to_char | double precision, text
to_char | integer, text
to_char | interval, text
to_char | numeric, text
to_char | real, text
to_char | timestamp with time zone, text
to_char | timestamp without time zone, text

if our current implementation just converts interval to date it is
surely wrong, at least because the year will be 0000 which does not
exist (AFAIK, the year before 0001 was -0001)

hannu=# select to_char('33s 15h 10m 5months'::interval, 'YYYY.MM.DD
HH24:MI:SS');
to_char
---------------------
0000.05.00 15:10:33
(1 row)

IMHO there should be INTERVAL-specific format characters - calling
5-month period "a May" is stupid (calling 1-month period "a January" is
even stupider :)

If folks want to convert interval to datetime they can always do it by
adding an interval to some base date - doing it automatically by adding
it to non-existing base date 000-00-00 will confuse people

and it is not supported in "plain" postgresql

hannu=# select ('33s 15h 10m 5months'::interval::timestamp);
ERROR: Cannot cast type 'interval' to 'timestamp with time zone'

TODO?

having strictly defined to_interval would be nice, but I think this
would be _another_ todo :)

--------------------------------
Hannu

#11Josh Berkus
josh@agliodbs.com
In reply to: Hannu Krosing (#10)
Re: Timestamp/Interval proposals: Part 2

Karel, Hannu,

To be perfectly honest, I was looking at my 7.1 documentation (courtesy
of DOSSIER) and hadn't realized that 7.2's implementation had got as
far as a function. I had tried to_char(interval) on 7.2.1, received
what looked like gibberish in return, and assumed that it was
unimplemented.

if there were a to_interval() then it should convert char data to
interval, like to_date(), to_number() and to_timestamp() do

Can we put THAT on the to-do list? I find it highly inconsistent that
the function for creating intervals is "interval". Currently, I deal
with it by creating my own to_interval function in template1.

actually we currently have to_char(x,t) functions for formatting the
following input types, where the second arg is always the format -
and
they do take different format strings for different types (i.e. we
dont
convert int or double to timestamp and then format that)

<snip>

IMHO there should be INTERVAL-specific format characters - calling
5-month period "a May" is stupid (calling 1-month period "a January"
is
even stupider :)

I wholeheartedly agree with Hannu, here. Might I suggest:

M# - Nummber of Months - abbr (Interval)
MM# - Number of Months (interval)
Y# - Number of years - abbr (Interval)
YY# - Number of years (Interval)
D# - Number of Days (interval)
W# - Number of weeks -abbr (interval)
WW# - number of weeks (interval)
HH# - Number of hours (interval)
MI# - Number of minutes (interval)
SS# - Number of seconds (interval)

Thus allowing:

hannu=# select to_char('33s 15h 10m 5months'::interval, 'M# D# HH# MI#
SS#');
�������to_char
---------------------
5 mon 0 days 15 hrs 10 min 33 sec�

or:

hannu=# select to_char('33s 15h 10m 5months'::interval, 'MM# D# HH# MI#
SS#');
�������to_char
---------------------
5 months 0 days 15 hrs 10 min 33 sec�

This needs more polishing, of course, but you can see where I'm going
with it.

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#12Karel Zak
zakkr@zf.jcu.cz
In reply to: Hannu Krosing (#10)
Re: Timestamp/Interval proposals: Part 2

On Mon, Jun 10, 2002 at 07:18:44PM +0200, Hannu Krosing wrote:

OK, I add to_interval() to may TODO (but it's unsure for 7.3).

hannu=# select to_char('33s 15h 10m 5months'::interval, 'YYYY.MM.DD
HH24:MI:SS');
to_char
---------------------
0000.05.00 15:10:33
(1 row)

I think, we can keep this behaviour for to_char(), the good thing
is that you can formatting interval to strings that seems like
standard time (15:10:33), etc.

The to_interval() will have another (you wanted) behaviour.

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#13Fduch the Pravking
fduch@antar.bryansk.ru
In reply to: Karel Zak (#8)
Re: Timestamp/Interval proposals: Part 2

On Mon, Jun 10, 2002 at 03:43:34PM +0200, Karel Zak wrote:

On Mon, Jun 10, 2002 at 04:26:47PM +0200, Hannu Krosing wr ote:

perhaps show them with the precision specified and keep data for bigger
units in biggest specified unit.

to_char('2years 1min 4sec'::interval, 'MM SS'); ==> '24mon 64sec'
to_char('2years 1min 4sec'::interval, 'MM MI SS'); ==> '24mon 1min 4sec'

Hmmm, but it's really out of to_char(). For example 'MM' is defined
as number in range 1..12.

And 'DD' is defined as in range 1..31...
What if I try to select '100 days'?

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DD HH24:MI:SS');
to_char
---------------------
0000-00-10 00:00:00

Even more:
DDD is day of year, but

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DDD HH24:MI:SS');
to_char
----------------------
0000-00-069 00:00:00

However, this works fine:
fduch=> SELECT extract(DAY from '100days'::interval);
date_part
-----------
100

fduch=> SELECT version();
version
---------------------------------------------------------------------
PostgreSQL 7.2.1 on i386-portbld-freebsd4.6, compiled by GCC 2.95.3

I think, interval is too different from timestamp,
and to_char(interval) needs another format syntax and logics...

--
Fduch M. Pravking

#14Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#12)
Re: Timestamp/Interval proposals: Part 2

On Tue, 2002-06-11 at 09:34, Karel Zak wrote:

On Mon, Jun 10, 2002 at 07:18:44PM +0200, Hannu Krosing wrote:

OK, I add to_interval() to may TODO (but it's unsure for 7.3).

hannu=# select to_char('33s 15h 10m 5months'::interval, 'YYYY.MM.DD
HH24:MI:SS');
to_char
---------------------
0000.05.00 15:10:33
(1 row)

I have not checked the SQL9x standards, but it seems from reading the
following links that Interval in Oracle and MimerSQL is actually 2
distinct types (YEAR-MONTH interval and DAY-HOUR-MINUTE-SECOND interval)
which can't be mixed (it is impossible to know if 1 "month" is 28, 29,
30 or 31 days

http://otn.oracle.com/products/rdb7/htdocs/y2000.htm

http://developer.mimer.com/documentation/Mimer_SQL_Reference_Manual/Syntax_Rules4.html#1113356

I think, we can keep this behaviour for to_char(), the good thing
is that you can formatting interval to strings that seems like
standard time (15:10:33), etc.

But interval _is_ _not_ point-in-time, it is a time_span_ .

It can be either good if it gives the results you want or bad if it does
give wrong results like returning 03:10:33 for the above

I would suggest that a separate to_char function would be written that
would be _specific_to_interval_ datatype - so wheb i do

to_char('33s 15h 10m'::interval, 'SS') I will get the actual length of

interval in seconds, 15*3600+10*60+33 = 54633s and not just the seconds part (33)

whereas to_char('33s 15h 10m'::interval, 'MI SS') would give

15*60+10=910 min 33 sec ('910 33')

-----------------
Hannu

#15Karel Zak
zakkr@zf.jcu.cz
In reply to: Hannu Krosing (#14)
Re: Timestamp/Interval proposals: Part 2

On Tue, Jun 11, 2002 at 11:16:13AM +0200, Hannu Krosing wrote:

On Tue, 2002-06-11 at 09:34, Karel Zak wrote:

I think, we can keep this behaviour for to_char(), the good thing
is that you can formatting interval to strings that seems like
standard time (15:10:33), etc.

But interval _is_ _not_ point-in-time, it is a time_span_ .

It can be either good if it gives the results you want or bad if it does
give wrong results like returning 03:10:33 for the above

I would suggest that a separate to_char function would be written that
would be _specific_to_interval_ datatype - so wheb i do

to_char('33s 15h 10m'::interval, 'SS') I will get the actual length of

interval in seconds, 15*3600+10*60+33 = 54633s and not just the seconds part (33)

whereas to_char('33s 15h 10m'::interval, 'MI SS') would give

15*60+10=910 min 33 sec ('910 33')

Well, If the to_char() for interval will output result that you want,
how can I output '15:10:33'?

For this I want two direffent function or anothers format marks for
to_char() like

to_char('33s 15h 10m'::interval, '#MI #SS');
---
'910 33'

but for "standard" marks (that now works like docs describe :-) will output
MI in 0..59 range.

to_char('33s 15h 10m'::interval, 'MI:SS');
---
'10:33'

IMHO it's acceptable. I don't want close the way for output formatting
in "standard" date/time ranges. We can support _both_ ways. Or not?

Thomas, you are quiet? :-)

Karel

PS. the PostgreSQL converting intervals to "standard" format too:

test=# select '33h 15m'::interval - '10h 2m 3s'::interval ;
?column?
----------
23:12:57
(1 row)

test=# select '45h 15m'::interval - '10h 2m 3s'::interval ;
?column?
----------------
1 day 11:12:57

(hmm.. I unsure if this is really released 7.2, I maybe have
some pre-7.2 version now. Is this 7.2 behaviuor?)

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#16Karel Zak
zakkr@zf.jcu.cz
In reply to: Fduch the Pravking (#13)
Re: Timestamp/Interval proposals: Part 2

On Tue, Jun 11, 2002 at 12:37:09PM +0400, Fduch the Pravking wrote:

And 'DD' is defined as in range 1..31...
What if I try to select '100 days'?

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DD HH24:MI:SS');
to_char
---------------------
0000-00-10 00:00:00

I already said it. The to_char() is 'tm' struct interpreter and use
standard internal PG routines for interval to 'tm' conversion. We can
talk about why 100days is converted to '10' days and months aren't
used. I agree this example seems strange. Thomas?

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#17Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#16)
Re: Timestamp/Interval proposals: Part 2

On Tue, 2002-06-11 at 11:31, Karel Zak wrote:

On Tue, Jun 11, 2002 at 12:37:09PM +0400, Fduch the Pravking wrote:

And 'DD' is defined as in range 1..31...
What if I try to select '100 days'?

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DD HH24:MI:SS');
to_char
---------------------
0000-00-10 00:00:00

I already said it. The to_char() is 'tm' struct interpreter and use
standard internal PG routines for interval to 'tm' conversion.

The point is it should _not_ do that for interval.

It does not convert to 'tm' for other types:

hannu=# select to_char(3.1415927,'0009D9');
to_char
---------
0003.1
(1 row)

also, afaik there is no conversion of interval to datetime in
postgresql:

hannu=# select '25mon37d1s'::interval::timestamp;
ERROR: Cannot cast type 'interval' to 'timestamp with time zone'

We can
talk about why 100days is converted to '10' days and months aren't
used. I agree this example seems strange. Thomas?

You can't convert days to months as there is no universal month length.

this is the current (correct) behaviour:

hannu=# select '25mon37d1s'::interval;
interval
--------------------------------
2 years 1 mon 37 days 00:00:01
(1 row)

------------------
Hannu

#18Hannu Krosing
hannu@tm.ee
In reply to: Karel Zak (#15)
Re: Timestamp/Interval proposals: Part 2

On Tue, 2002-06-11 at 11:21, Karel Zak wrote:

On Tue, Jun 11, 2002 at 11:16:13AM +0200, Hannu Krosing wrote:

On Tue, 2002-06-11 at 09:34, Karel Zak wrote:

I think, we can keep this behaviour for to_char(), the good thing
is that you can formatting interval to strings that seems like
standard time (15:10:33), etc.

But interval _is_ _not_ point-in-time, it is a time_span_ .

It can be either good if it gives the results you want or bad if it does
give wrong results like returning 03:10:33 for the above

I would suggest that a separate to_char function would be written that
would be _specific_to_interval_ datatype - so wheb i do

to_char('33s 15h 10m'::interval, 'SS') I will get the actual length of

interval in seconds, 15*3600+10*60+33 = 54633s and not just the seconds part (33)

whereas to_char('33s 15h 10m'::interval, 'MI SS') would give

15*60+10=910 min 33 sec ('910 33')

Well, If the to_char() for interval will output result that you want,
how can I output '15:10:33'?

For this I want two direffent function or anothers format marks for
to_char() like

to_char('33s 15h 10m'::interval, '#MI #SS');
---
'910 33'

and it is probably easyer to implement too - no need to first collect
all possible format chars.

but for "standard" marks (that now works like docs describe :-) will output
MI in 0..59 range.

to_char('33s 15h 10m'::interval, 'MI:SS');
---
'10:33'

IMHO it's acceptable. I don't want close the way for output formatting
in "standard" date/time ranges. We can support _both_ ways. Or not?

perhaps we should do as to_char does for floats -- return ### if
argument cant be shown with given format ?

hannu=# select to_char(1000.0,'0000D00') as good,
hannu-# to_char(1000.0, '000D00') as bad;
good | bad
----------+---------
1000.00 | ###.##
(1 row)

no need to change current documented behaviour without good reason

Thomas, you are quiet? :-)

Karel

PS. the PostgreSQL converting intervals to "standard" format too:

test=# select '33h 15m'::interval - '10h 2m 3s'::interval ;
?column?
----------
23:12:57
(1 row)

test=# select '45h 15m'::interval - '10h 2m 3s'::interval ;
?column?
----------------
1 day 11:12:57

(hmm.. I unsure if this is really released 7.2, I maybe have
some pre-7.2 version now. Is this 7.2 behaviuor?)

Yes.

And this is still an interval, not a timestamp:

hannu=# select '4500h 15m'::interval - '10h 2m 3s'::interval ;
?column?
-------------------
187 days 02:12:57
(1 row)

----------------------------------
Hannu

#19Thomas Lockhart
thomas@pgsql.com
In reply to: Josh Berkus (#1)
Re: Timestamp/Interval proposals: Part 2

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DD HH24:MI:SS');
---------------------
0000-00-10 00:00:00

I already said it. The to_char() is 'tm' struct interpreter and use
standard internal PG routines for interval to 'tm' conversion. We can
talk about why 100days is converted to '10' days and months aren't
used. I agree this example seems strange. Thomas?

Not sure why 100 is becoming 10, except that the formatting string is
specifying a field width of two characters (right?). And for intervals,
years and months are not interchangable with days so values do not
overflow from days to months fields.

I played around with to_char(interval,text) but don't understand the
behavior either.

- Thomas

#20Thomas Lockhart
thomas@pgsql.com
In reply to: Josh Berkus (#1)
Re: Timestamp/Interval proposals: Part 2

I already said it. The to_char() is 'tm' struct interpreter and use
standard internal PG routines for interval to 'tm' conversion.

The point is it should _not_ do that for interval.

I use the tm structure to hold this structured information. I *think*
that Karel's usage is just what is intended by my support routines,
though I haven't looked at it in quite some time. Let me know if you
want me to look Karel...

- Thomas

#21Karel Zak
zakkr@zf.jcu.cz
In reply to: Thomas Lockhart (#19)
Re: Timestamp/Interval proposals: Part 2

On Tue, Jun 11, 2002 at 06:22:55AM -0700, Thomas Lockhart wrote:

fduch=> SELECT to_char('100days'::interval, 'YYYY-MM-DD HH24:MI:SS');
---------------------
0000-00-10 00:00:00

I already said it. The to_char() is 'tm' struct interpreter and use
standard internal PG routines for interval to 'tm' conversion. We can
talk about why 100days is converted to '10' days and months aren't
used. I agree this example seems strange. Thomas?

Not sure why 100 is becoming 10, except that the formatting string is
specifying a field width of two characters (right?). And for intervals,

Oops. Yes, you are right it's %02d. I forgot it. Sorry :-)

years and months are not interchangable with days so values do not
overflow from days to months fields.

I played around with to_char(interval,text) but don't understand the
behavior either.

OK. And what is wanted behavior?

DD = day
## = error

1) '30h 10m 15s' 'HH MI SS' ---> '06 10 15'
'30h 10m 15s' 'HH MI SS DD' ---> '06 10 15 1'

2) '30h 10m 15s' 'HH MI SS' ---> '30 10 15'
'30h 10m 15s' 'HH MI SS DD' ---> '30 10 15 ##'

3) '30h 10m 15s' 'HH MI SS' ---> '30 10 15'
'30h 10m 15s' 'HH MI SS DD' ---> '06 10 15 1'

4) use both 1) and 2) but with different marks like
'HH' and '#HH' (or other special prefix)

5) '2week' 'DD' ---> '14'

6) '2week' 'HH' ---> '00'

7) '2week' 'HH' ---> '336'

8) '2week' 'DD HH' ---> '14 00'

9) ???

I unsure what is best, Please, mark right outputs or write examples.

-- for all is probably right idea use '####' in output
if input is not possible convert to wanted format (like current
float to_char() behavior).

BTW:

test=# select date_part('hour', '30h 10m 15s'::interval);
date_part
-----------
6

test=# select date_part('day', '30h 10m 15s'::interval);
date_part
-----------
1

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#22Josh Berkus
josh@agliodbs.com
In reply to: Karel Zak (#12)
Re: Timestamp/Interval proposals: Part 2

Karel,

The to_interval() will have another (you wanted) behaviour.

Please, please, please do not use to_interval for text formatting of
intervals. It's very inconsistent with the naming of other conversion
functions, and will confuse the heck out of a lot of users. As well as
messing up my databases, which have to_interval as a replacement for the
problematically named "interval" function.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#23Karel Zak
zakkr@zf.jcu.cz
In reply to: Josh Berkus (#22)
Re: Timestamp/Interval proposals: Part 2

On Tue, Jun 11, 2002 at 09:36:39AM -0700, Josh Berkus wrote:

Karel,

The to_interval() will have another (you wanted) behaviour.

Please, please, please do not use to_interval for text formatting of
intervals. It's very inconsistent with the naming of other conversion
functions, and will confuse the heck out of a lot of users. As well as
messing up my databases, which have to_interval as a replacement for the
problematically named "interval" function.

Yes, agree. It wasn't well-advised.

It will probably to_char() with special 'interval' behaviour or
format marks. But I still don't know how behaviour is right.

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

#24Hannu Krosing
hannu@tm.ee
In reply to: Josh Berkus (#22)
Re: Timestamp/Interval proposals: Part 2

On Tue, 2002-06-11 at 18:36, Josh Berkus wrote:

Karel,

The to_interval() will have another (you wanted) behaviour.

Please, please, please do not use to_interval for text formatting of
intervals.

If he meant what _I_ described then this was exactly that, i.e.
converting (string,format) to interval.

----------------
Hannu