Have I found an interval arithmetic bug?
Or am I misunderstanding something?
Try this. The result of each “select” is shown as the trailing comment on the same line. I added whitespace by hand to line up the fields.
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
As I reason it, the last four “select” statements are all semantically the same. They’re just different syntaxes to add the two intervals that the first two “select” statements use separately. There’s one odd man out. And I reason this one to be wrong. Is there a flaw in my reasoning?
Further… there’s a notable asymmetry. The fractional part of “1.7 years” is 8.4 months. But the fractional part of the months value doesn’t spread further down into days. However, the fractional part of “29.4 months” (12 days) _does_ spread further down into days. What’s the rationale for this asymmetry?
I can’t see that my observations here can be explained by the difference between calendar time and clock time. Here I’m just working with non-metric units like feet and inches. One year is just defined as 12 months. And one month is just defined as 30 days. All that stuff about adding a month to 3-Feb-2020 taking you to 3-Mar-2020 (same for leap years and non-leap years), and that other stuff about adding one day to 23:00 on the day before the “spring forward” moment taking you to 23:00 on the next day (i.e. when intervals are added to timestamps) is downstream of simply adding two intervals.
On Thu, Apr 1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
Or am I misunderstanding something?
Try this. The result of each “select” is shown as the trailing comment on the same line. I added whitespace by hand to line up the fields.
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
As I reason it, the last four “select” statements are all semantically the same. They’re just different syntaxes to add the two intervals that the first two “select” statements use separately. There’s one odd man out. And I reason this one to be wrong. Is there a flaw in my reasoning?
[Thread moved to hackers.]
[Thread moved to hackers.]
Looking at your report, I thought I could easily find the cause, but it
wasn't obvious. What is happening is that when you cast a float to an
integer in C, it rounds toward zero, meaning that 8.4 rounds to 8 and
-8.4 rounds to -8. The big problem here is that -8.4 is rounding in a
positive direction, while 8.4 rounds in a negative direction. See this:
int(int(-8.4) + 29)
21
int(int(29) + -8.4)
20
When you do '-1.7 years' first, it becomes -8, and then adding 29 yields
21. In the other order, it is 29 - 8.4, which yields 20.6, which
becomes 20. I honestly had never studied this interaction, though you
would think I would have seen it before. One interesting issue is that
it only happens when the truncations happen to values with different
signs --- if they are both positive or negative, it is fine.
The best fix, I think, is to use rint() to round to the closest
integer, not toward zero. The attached patch does this in a few places,
but the code needs more research if we are happy with this approach,
since there are probably other cases. Using rint() does help to produce
more accurate results, though the regression tests show no change from
this patch.
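To make the cast behavior concrete, here is a standalone C sketch (nothing beyond libc; compile with -lm). It only isolates the rounding, not the actual interval decoding:

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* a cast truncates toward zero, so the two orders disagree */
    printf("%d\n", (int) -8.4 + 29);            /* -8 + 29 = 21 */
    printf("%d\n", (int) (29 + -8.4));          /* (int) 20.6 = 20 */

    /* rint() rounds to the nearest integer, symmetrically for both signs */
    printf("%g %g\n", rint(-8.4), rint(8.4));   /* -8 8 */
    return 0;
}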
Further… there’s a notable asymmetry. The fractional part of “1.7 years” is 8.4
months. But the fractional part of the months value doesn’t spread further down
into days. However, the fractional part of “29.4 months” (12 days) _does_
spread further down into days. What’s the rationale for this asymmetry?
Yes, looking at the code, it seems we only spill down to one unit, not
more. I think we need to have a discussion if we want to change that.
I think the idea was that if you specify a non-whole number, you
probably want to spill down one level, but don't want it spilling all
the way to milliseconds, which is certainly possible.
I can’t see that my observations here can be explained by the difference
between calendar time and clock time. Here I’m just working with non-metric
units like feet and inches. One year is just defined as 12 months. And one
month is just defined as 30 days. All that stuff about adding a month to
3-Feb-2020 taking you to 3-Mar-2020 (same for leap years and non-leap years),
and that other stuff about adding one day to 23:00 on the day before the
“spring forward” moment taking you to 23:00 on the next day (i.e. when
intervals are added to timestamps) is downstream of simply adding two
intervals.
Ah, seems you have done some research. ;-)
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..a92e19471d 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3307,28 +3307,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3565,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3601,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
Thread moved to hackers, with a patch.
---------------------------------------------------------------------------
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
bruce@momjian.us wrote:
[Thread moved to hackers.] …The best fix I think is…
Bryn wrote: Further… there’s a notable asymmetry. The fractional part of “1.7 years” is 8.4 months. But the fractional part of the months value doesn’t spread further down into days. However, the fractional part of “29.4 months” (12 days) _does_ spread further down into days. What’s the rationale for this asymmetry?
Yes, looking at the code, it seems we only spill down to one unit, not more. I think we need to have a discussion if we want to change that. I think the idea was that if you specify a non-whole number, you probably want to spill down one level, but don't want it spilling all the way to milliseconds, which is certainly possible.
Thanks for the quick response, Bruce. I was half expecting (re the bug) an explanation that showed that I’d (once again) misunderstood a fundamental principle.
I should come clean about the larger context. I work for Yugabyte, Inc. We have a distributed SQL database that uses the Version 11.2 PostgreSQL C code for SQL processing “as is”.
https://blog.yugabyte.com/distributed-postgresql-on-a-google-spanner-architecture-query-layer/
The founders decided to document YugabyteDB’s SQL functionality explicitly rather than just to point to the published PostgreSQL doc. (There are some DDL differences that reflect the storage layer differences.) I’m presently documenting date-time functionality. This is why I’m so focused on understanding the semantics exactly and on understanding the requirements that the functionality was designed to meet. I’m struggling with interval functionality. I read this:
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
Notice that the doc says that spill-down goes all the way to seconds and not just one unit. This simple test is consistent with the doc (output follows the dash-dash comment):
select ('6.54321 months'::interval)::text as i; -- 6 mons 16 days 07:06:40.32
You see similar spill-down with this:
select ('876.54321 days'::interval)::text as i; -- 876 days 13:02:13.344
And so on down through the remaining smaller units. It’s only this test that doesn’t spill down one unit:
select ('6.54321 years'::interval)::text as i; -- 6 years 6 mons
This does suggest a straight bug rather than a case for committee debate about what might have been intended. What do you think, Bruce?
On Fri, Apr 2, 2021 at 11:06 AM Bruce Momjian <bruce@momjian.us> wrote:
Thread moved to hackers, with a patch.
---------------------------------------------------------------------------
Here is a link to that thread, for others who might be curious about it as
I was:
/messages/by-id/20210402180549.GF9270@momjian.us
I get why it can make sense to move a thread. But when you do, posting a link to the new thread would be appreciated. Thanks!
Ken
--
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client
ken.tanzer@agency-software.org
(253) 245-3801
Subscribe to the mailing list <agency-general-request@lists.sourceforge.net?body=subscribe> to learn more about AGENCY or follow the discussion.
Bruce:
Thanks for tackling this issue.
The patch looks good to me.
When you have time, can you include the places which were not covered by the current diff?
Cheers
On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Apr 1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
While maybe there is an argument for fixing the negative/positive rounding issue, there is no way this gets solved without breaking the current implementation:
select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' + interval '0.1 years'
should not equal 0, but it certainly does.
Unless we take the concept of 0.3 years = 3 months and move to something
along the lines of
1 year = 360 days
1 month = 30 days
so therefore
0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days
0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
Then, and only if we don't go to any more than tenths of a year, does the
math work. Probably this should resolve down to seconds and then work
backwards - but unless we're looking at breaking the entire way it
currently resolves things - I don't think this is of much value.
Doing math on intervals is like doing math on rounded numbers - there is
always going to be a pile of issues because the level of precision just is
not good enough.
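For what it's worth, a quick standalone C check of that 360-day scheme (purely hypothetical; this is not how PostgreSQL computes intervals). Note that even here a truncating cast would break the cancellation, because 360 * 0.7 is fractionally below 252 in binary floating point, so this sketch uses rint():

#include <math.h>
#include <stdio.h>

int main(void)
{
    int d3 = (int) rint(360 * 0.3);   /* 108 days = 3 months 18 days */
    int d4 = (int) rint(360 * 0.4);   /* 144 days = 4 months 24 days */
    int d7 = (int) rint(360 * 0.7);   /* 252 days = 8 months 12 days */

    printf("%d\n", d3 + d4 - d7);     /* prints 0: the sum cancels exactly */
    return 0;
}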
John
On Fri, Apr 2, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
I should come clean about the larger context. I work for Yugabyte, Inc. We have a distributed SQL database that uses the Version 11.2 PostgreSQL C code for SQL processing “as is”.
https://blog.yugabyte.com/distributed-postgresql-on-a-google-spanner-architecture-query-layer/
The founders decided to document YugabyteDB’s SQL functionality explicitly rather than just to point to the published PostgreSQL doc. (There are some DDL differences that reflect the storage layer differences.) I’m presently documenting date-time functionality. This is why I’m so focused on understanding the semantics exactly and on understanding the requirements that the functionality was designed to meet. I’m struggling with interval functionality. I read this:
[Sorry, also moved this to hackers. I might normally do all the
discussion on general, with patches, and then move it to hackers, but
our PG 14 deadline is next week, so it is best to move it now in hopes
it can be addressed in PG 14.]
Sure, seems like a good idea.
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
Notice that the doc says that spill-down goes all the way to seconds and not just one unit. This simple test is consistent with the doc (output follows the dash-dash comment):
select ('6.54321 months'::interval)::text as i; -- 6 mons 16 days 07:06:40.32
You see similar spill-down with this:
select ('876.54321 days'::interval)::text as i; -- 876 days 13:02:13.344
And so on down through the remaining smaller units. It’s only this test that doesn’t spill down one unit:
select ('6.54321 years'::interval)::text as i; -- 6 years 6 mons
This does suggest a straight bug rather than a case for committee debate about what might have been intended. What do you think, Bruce?
So, that gets into more detail. When I said "spill down one unit", I
was not talking about _visible_ units, but rather the three internal
units used by Postgres:
https://www.postgresql.org/docs/13/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
Internally interval values are stored as months, days, and seconds.
-------------------------
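For reference, those three units are exactly the fields of the on-disk Interval struct (from src/include/datatype/timestamp.h, comments lightly abridged):

typedef struct
{
    TimeOffset  time;           /* all time units other than days, months and
                                 * years, stored as microseconds */
    int32       day;            /* days */
    int32       month;          /* months and years */
} Interval;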
However, while that explains why years don't spill beyond months, it
doesn't explain why months would spill beyond days. This certainly
seems inconsistent.
I have modified the patch to prevent partial months from creating
partial hours/minutes/seconds, so the output is now at least consistent
based on the three units:
SELECT ('6.54321 years'::interval)::text as i;
i
----------------
6 years 7 mons
SELECT ('6.54321 months'::interval)::text as i;
i
----------------
6 mons 16 days
SELECT ('876.54321 days'::interval)::text as i;
i
-----------------------
876 days 13:02:13.344
Partial years keeps it in months, partial months takes it to days, and
partial days take it to hours/minutes/seconds. This seems like an
improvement.
This also changes the regression test output, I think for the better:
SELECT INTERVAL '1.5 weeks';
i
------------------
- 10 days 12:00:00
+ 10 days
The new output is less precise, but probably closer to what the user
wanted.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
extra_days = (int) frac;
tm->tm_mday += extra_days;
frac -= extra_days;
- AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
}
/* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3564,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3600,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 4245016c8e..d1bb00db01 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -529,28 +529,28 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index c5ffa9f2cc..bd35c930b0 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -34,10 +34,10 @@ SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-1 day +02:03:00
(1 row)
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
- Ten days twelve hours
------------------------
- 10 days 12:00:00
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
+ Ten days
+----------
+ 10 days
(1 row)
SELECT INTERVAL '1.5 months' AS "One month 15 days";
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 11c1929bef..bd141e7b52 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -11,7 +11,7 @@ SELECT INTERVAL '+02:00' AS "Two hours";
SELECT INTERVAL '-08:00' AS "Eight hours";
SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
On Fri, Apr 2, 2021 at 01:20:26PM -0700, Ken Tanzer wrote:
On Fri, Apr 2, 2021 at 11:06 AM Bruce Momjian <bruce@momjian.us> wrote:
Thread moved to hackers, with a patch.
---------------------------------------------------------------------------
Here is a link to that thread, for others who might be curious about it as I was:
/messages/by-id/20210402180549.GF9270@momjian.us
I get why it can make sense to move a thread. But when you do, posting a link to the new thread would be appreciated. Thanks!
I didn't think anyone but the original poster, who was copied in the new
thread, would really care about this thread, but it seems I was wrong.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Apr 2, 2021 at 01:27:33PM -0700, Zhihong Yu wrote:
Bruce:
Thanks for tackling this issue. The patch looks good to me.
When you have time, can you include the places which were not covered by the current diff?
I have just posted a new version of the patch which I think covers all
the right areas.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Apr 2, 2021 at 02:00:03PM -0700, John W Higgins wrote:
On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian <bruce@momjian.us> wrote:
While maybe there is an argument for fixing the negative/positive rounding issue, there is no way this gets solved without breaking the current implementation:
select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' + interval '0.1 years'
should not equal 0, but it certainly does.
My new code returns 0.2 months for this, not zero:
SELECT interval '0.3 years' + interval '0.4 years' -
interval '0.7 years' + interval '0.1 years';
?column?
----------
2 mons
which is also wrong since:
SELECT interval '0.1 years';
interval
----------
1 mon
Unless we take the concept of 0.3 years = 3 months and move to something along the lines of
1 year = 360 days
1 month = 30 days
so therefore
0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days
0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
Then, and only if we don't go to any more than tenths of a year, does the math work. Probably this should resolve down to seconds and then work backwards - but unless we're looking at breaking the entire way it currently resolves things - I don't think this is of much value.
Doing math on intervals is like doing math on rounded numbers - there is always going to be a pile of issues because the level of precision just is not good enough.
I think the big question is what units do people want with fractional
values. I have posted a follow-up email that spills only for one unit,
which I think is the best approach.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Hi,
bq. My new code returns 0.2 months for this, not zero
Can you clarify (the output below showed 2 mons, not 0.2)?
Thanks
On Fri, Apr 2, 2021 at 07:47:32PM -0400, Bruce Momjian wrote:
I have modified the patch to prevent partial months from creating partial hours/minutes/seconds, so the output is now at least consistent based on the three units:
SELECT ('6.54321 years'::interval)::text as i;
i
----------------
6 years 7 mons
SELECT ('6.54321 months'::interval)::text as i;
i
----------------
6 mons 16 days
SELECT ('876.54321 days'::interval)::text as i;
i
-----------------------
876 days 13:02:13.344
Partial years keeps it in months, partial months takes it to days, and partial days take it to hours/minutes/seconds. This seems like an improvement.
This also changes the regression test output, I think for the better:
SELECT INTERVAL '1.5 weeks';
i
------------------
- 10 days 12:00:00
+ 10 days
The new output is less precise, but probably closer to what the user wanted.
Thinking some more about this, the connection between months and days is
very inaccurate, 30 days/month, but the connection between days and
hours/minutes/seconds is pretty accurate, except for leap days.
Therefore, returning "10 days 12:00:00" is in many ways better, but
returning hours/minutes/seconds for fractional months is very arbitrary
and suggests an accuracy that doesn't exist. However, I am afraid that
trying to enforce that distinction in the Postgres behavior would appear
very arbitrary, so what I did above is probably the best I can do.
Here is another example of what we have:
SELECT INTERVAL '1.5 years';
interval
---------------
1 year 6 mons
SELECT INTERVAL '1.5 months';
interval
---------------
1 mon 15 days
SELECT INTERVAL '1.5 weeks';
interval
----------
10 days
SELECT INTERVAL '1.5 days';
interval
----------------
1 day 12:00:00
SELECT INTERVAL '1.5 hours';
interval
----------
01:30:00
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
bruce@momjian.us wrote:
I have just posted a new version of the patch which I think covers all the right areas.
I found the relevant email from you to pgsql-hackers here:
/messages/by-id/20210402234732.GA29125@momjian.us
You said:
I have modified the patch to prevent partial months from creating partial hours/minutes/seconds… Partial years keeps it in months, partial months takes it to days, and partial days take it to hours/minutes/seconds. This seems like an improvement.
I have written some PL/pgSQL code that faithfully emulates the behavior that I see in my present vanilla PostgreSQL Version 13.2 system in a wide range of tests. This is the key part:
declare
  -- Assumed context: p is the input record (yy, mo, dd, hh, mi, ss), all
  -- numeric; modeled_interval_t is the composite (m int, d int, s numeric).
  m1          int     not null := trunc(p.mo);
  m_remainder numeric not null := p.mo - m1::numeric;
  -- years contribute trunc(p.yy*12) whole months (truncation toward zero,
  -- as in the C code)
  m           int     not null := trunc(p.yy*12) + m1;
  -- the fractional months spill into days at 30 days per month
  d_real      numeric not null := p.dd + m_remainder*30.0;
  d           int     not null := floor(d_real);
  d_remainder numeric not null := d_real - d::numeric;
  -- the fractional days spill all the way into seconds at 24 hours per day
  s           numeric not null := d_remainder*24.0*60.0*60.0 +
                                  p.hh*60.0*60.0 +
                                  p.mi*60.0 +
                                  p.ss;
begin
  return (m, d, s)::modeled_interval_t;
end;
These quantities:
p.yy, p.mo, p.dd, p.hh, p.mi, and p.ss
are the user’s parameterization. All are real numbers. Because non-integral values for years, months, days, hours, and minutes are allowed when you specify a value using the ::interval typecast, my reference doc must state the rules. I would have struggled to express these rules in prose—especially given the use both of trunc() and floor(). I would have struggled more to explain what requirements these rules meet.
I gather that by the time YugabyteDB has adopted your patch, my PL/pgSQL will no longer be a correct emulation. So I’ll re-write it then.
I intend to advise users always to constrain the values, when they specify an interval value explicitly, so that the years, months, days, hours, and minutes are integers. This is, after all, the discipline that the make_interval() built-in imposes. So I might recommend using only that.
On Fri, Apr 2, 2021 at 05:50:59PM -0700, Bryn Llewellyn wrote:
are the user’s parameterization. All are real numbers. Because non-integral
values for years, months, days, hours, and minutes are allowed when you specify
a value using the ::interval typecast, my reference doc must state the rules. I
would have struggled to express these rules in prose—especially given the use
both of trunc() and floor(). I would have struggled more to explain what
requirements these rules meet.
The fundamental issue is that while months, days, and seconds are
consistent in their own units, when you have to cross from one unit to
another, it is by definition imprecise, since the interval is not tied
to a specific date, with its own days-of-the-month and leap days and
daylight savings time changes. It feels like it is going to be
imprecise no matter what we do.
Adding to this is the fact that interval values are stored in C 'struct
tm' defined in libc's ctime(), where months are integers, so carrying
around non-integer month values until we get a final result would add a
lot of complexity, and complexity to a system that is by definition
imprecise, which doesn't seem worth it.
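A minimal standalone sketch of that constraint (the two constants are PostgreSQL's, defined inline here so it compiles on its own): once the months value lands in an int field, the fraction must either cascade immediately or be rounded away.

#include <math.h>
#include <stdio.h>

#define MONTHS_PER_YEAR 12
#define DAYS_PER_MONTH  30      /* the documented conversion factor */

int main(void)
{
    double fval = -0.7;                       /* fractional part of '-1.7 years' */
    double months = fval * MONTHS_PER_YEAR;   /* -8.4 months */
    int    tm_mon = (int) rint(months);       /* integer field: -8 */
    double leftover = (months - tm_mon) * DAYS_PER_MONTH;

    /* prints: tm_mon = -8, leftover = -12.0 days */
    printf("tm_mon = %d, leftover = %.1f days\n", tm_mon, leftover);
    return 0;
}

That 0.4-month leftover is the same 12 days that '29.4 months' spills into days; for fractional years it is simply rounded away.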
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Hi,
I got a local build with the second patch, where:
yugabyte=# SELECT interval '0.3 years' + interval '0.4 years' -
interval '0.7 years';
?column?
----------
1 mon
I think the outcome is a bit unintuitive (I would expect a result close to 0).
Cheers
On Fri, 2 Apr 2021 at 21:08, Zhihong Yu <zyu@yugabyte.com> wrote:
Hi,
I got a local build with the second patch, where:
yugabyte=# SELECT interval '0.3 years' + interval '0.4 years' -
interval '0.7 years';
?column?
----------
1 mon
I think the outcome is a bit unintuitive (I would expect a result close to 0).
That's not fundamentally different from this:
odyssey=> select 12 * 3/10 + 12 * 4/10 - 12 * 7/10;
?column?
----------
-1
(1 row)
odyssey=>
And actually the result is pretty close to 0. I mean it’s less than 0.1
year.
I wonder if it might have been better if only integers had been accepted
for the components? If you want 0.3 years write 0.3 * '1 year'::interval.
But changing it now would be a pretty significant backwards compatibility
break.
There's really no avoiding counterintuitive behaviour though. Look at this:
odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval;
?column?
------------------
-1 mons +30 days
(1 row)
odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval = '0';
?column?
----------
t
(1 row)
odyssey=>
In other words, doing the “same” calculation but with multiplying 1 year
intervals by floats to get the values to add, you end up with an interval
that while not identical to 0 does compare equal to 0. So very close to 0;
in fact, as close to 0 as you can get without actually being identically 0.
Hi,
The mix of interval and comparison with float is not easy to interpret. See
the following (I got 0.0833 since the result for interval '0.3 years' +
interval '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).
yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval = '0.0833 year'::interval;
?column?
----------
f
As long as Bruce's patch makes improvements over the current behavior, I
think that's fine.
Cheers
On Fri, Apr 2, 2021 at 06:11:08PM -0700, Zhihong Yu wrote:
Hi,
I got a local build with the second patch, where:
yugabyte=# SELECT interval '0.3 years' + interval '0.4 years' -
interval '0.7 years';
?column?
----------
1 mon
I think the outcome is a bit unintuitive (I would expect a result close to 0).
Uh, the current code returns:
SELECT interval '0.3 years' + interval '0.4 years' - interval '0.7 years';
?column?
----------
-1 mon
and with the patch it is:
SELECT interval '0.3 years' + interval '0.4 years' - interval '0.7 years';
?column?
----------
1 mon
What it isn't, is zero months, which is the obviously ideal answer.
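The whole-month arithmetic behind both outputs, as a standalone C sketch (just the rounding, not the actual decode path):

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* old behavior: truncating casts give 3, 4, and 8 months */
    printf("%d\n", (int) (0.3 * 12) + (int) (0.4 * 12) - (int) (0.7 * 12));
    /* prints -1 */

    /* patched behavior: rint() gives 4, 5, and 8 months */
    printf("%d\n", (int) rint(0.3 * 12) + (int) rint(0.4 * 12)
                   - (int) rint(0.7 * 12));
    /* prints 1 */
    return 0;
}

Neither set of whole months can sum to zero, because 3.6 + 4.8 - 8.4 only cancels before the rounding to integer months.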
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Apr 2, 2021 at 07:06:08PM -0700, Zhihong Yu wrote:
Hi,
The mix of interval and comparison with float is not easy to interpret. See the following (I got 0.0833 since the result for interval '0.3 years' + interval '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).
yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 *
'1 year'::interval = '0.0833 year'::interval;
?column?
----------
f
As long as Bruce's patch makes improvements over the current behavior, I think that's fine.
I wish I could figure out how to improve it any further. What is odd is
that I have never seen this reported as a problem before. I plan to
apply this early next week for PG 14.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Bruce:
In src/interfaces/ecpg/pgtypeslib/interval.c, how about the following places?
Around line 158:
case 'Y':
tm->tm_year += val;
tm->tm_mon += (fval * MONTHS_PER_YEAR);
Around line 194:
tm->tm_year += val;
tm->tm_mon += (fval * MONTHS_PER_YEAR);
Is rint() needed for these two cases?
Cheers
On Fri, Apr 2, 2021 at 07:53:35PM -0700, Zhihong Yu wrote:
Bruce:
In src/interfaces/ecpg/pgtypeslib/interval.c, how about the following places?
Around line 158:
case 'Y':
    tm->tm_year += val;
    tm->tm_mon += (fval * MONTHS_PER_YEAR);
Around line 194:
    tm->tm_year += val;
    tm->tm_mon += (fval * MONTHS_PER_YEAR);
Is rint() needed for these two cases?
Ah, yes, good point. Updated patch attached.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
extra_days = (int) frac;
tm->tm_mday += extra_days;
frac -= extra_days;
- AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
}
/* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3564,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3600,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 4245016c8e..fccb9765ae 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -529,28 +529,28 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index c5ffa9f2cc..bd35c930b0 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -34,10 +34,10 @@ SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-1 day +02:03:00
(1 row)
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
- Ten days twelve hours
------------------------
- 10 days 12:00:00
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
+ Ten days
+----------
+ 10 days
(1 row)
SELECT INTERVAL '1.5 months' AS "One month 15 days";
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 11c1929bef..bd141e7b52 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -11,7 +11,7 @@ SELECT INTERVAL '+02:00' AS "Two hours";
SELECT INTERVAL '-08:00' AS "Eight hours";
SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
On Fri, Apr 02, 2021 at 10:21:26PM -0400, Bruce Momjian wrote:
I wish I could figure out how to improve it any further. What is odd is
that I have never seen this reported as a problem before. I plan to
apply this early next week for PG 14.
On Fri, Apr 02, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
bruce@momjian.us wrote:
Yes, looking at the code, it seems we only spill down to one unit, not more. I think we need to have a discussion if we want to change that.
If this is a bug, then there's no deadline - and if you're proposing a behavior
change, then I don't think it's a good time to begin the discussion.
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
Your patch changes what seems to be the intended behavior, with the test case
added by:
|commit 57bfb27e60055c10e42b87e68a894720c2f78e70
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date: Mon Sep 4 01:26:28 2006 +0000
|
| Fix interval input parser so that fractional weeks and months are
| cascaded first to days and only what is leftover into seconds. This
And documented by:
|commit dbf57d31f8d7bf5c058a9fab2a1ccb4a336864ce
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date: Sun Nov 9 17:09:48 2008 +0000
|
| Add some documentation about handling of fractions in interval input.
| (It's always worked like this, but we never documented it before.)
If you were to change the behavior, I think you'd have to update the
documentation, too - but I think that's not a desirable change.
I *am* curious why the YEAR, DECADE, CENTURY, and MILLENNIUM cases only handle fractional intervals down to the next smaller unit, and not down to seconds/milliseconds. I wrote a patch to handle that by adding AdjustFractMons(), if we agree that it's desirable to change.
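The patch itself is not shown in this thread, but by analogy with the existing AdjustFractDays() such a helper might look roughly like this (a sketch only, under that assumption):

/* Sketch: cascade fractional months into days, then let AdjustFractDays()
 * cascade any fractional days into seconds, using the documented
 * 1 month = 30 days factor. */
static void
AdjustFractMons(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
{
    int         extra_mons;

    if (frac == 0)
        return;
    frac *= scale;
    extra_mons = (int) frac;
    tm->tm_mon += extra_mons;
    frac -= extra_mons;
    AdjustFractDays(frac, tm, fsec, DAYS_PER_MONTH);
}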
--
Justin
On Mon, Apr 5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
On Fri, Apr 02, 2021 at 10:21:26PM -0400, Bruce Momjian wrote:
I wish I could figure out how to improve it any further. What is odd is that I have never seen this reported as a problem before. I plan to apply this early next week for PG 14.
On Fri, Apr 02, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
bruce@momjian.us wrote:
Yes, looking at the code, it seems we only spill down to one unit, not more. I think we need to have a discussion if we want to change that.
If this is a bug, then there's no deadline - and if you're proposing a behavior
change, then I don't think it's a good time to begin the discussion.
Well, bug or not, we are not going to change back branches for this, and
if you want a larger discussion, it will have to wait for PG 15.
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
I see that. What is not clear here is how far we flow down. I was
looking at adding documentation or regression tests for that, but was
unsure. I adjusted the docs slightly in the attached patch.
Your patch changes what seems to be the intended behavior, with the test case added by:
|commit 57bfb27e60055c10e42b87e68a894720c2f78e70
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date: Mon Sep 4 01:26:28 2006 +0000
|
| Fix interval input parser so that fractional weeks and months are
| cascaded first to days and only what is leftover into seconds. This
And documented by:
|commit dbf57d31f8d7bf5c058a9fab2a1ccb4a336864ce
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date: Sun Nov 9 17:09:48 2008 +0000
|
| Add some documentation about handling of fractions in interval input.
| (It's always worked like this, but we never documented it before.)
If you were to change the behavior, I think you'd have to update the documentation, too - but I think that's not a desirable change.
I *am* curious why the YEAR, DECADE, CENTURY, and MILLENNIUM cases only handle fractional intervals down to the next smaller unit, and not down to seconds/milliseconds. I wrote a patch to handle that by adding AdjustFractMons(), if we agree that it's desirable to change.
The interaction of months/days/seconds is so imprecise that passing it
further down doesn't make much sense, and suggests a precision that
doesn't exist, but if people prefer that we can do it.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 7c341c8e3f..6b50fb849f 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2775,7 +2775,7 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
<literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
converted to the appropriate number of months, days, and seconds
for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
+ months or days, the fraction is added to the next lower-order internal field
using the conversion factors 1 month = 30 days and 1 day = 24 hours.
For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
Only seconds will ever be shown as fractional on output.
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
extra_days = (int) frac;
tm->tm_mday += extra_days;
frac -= extra_days;
- AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
}
/* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3564,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3600,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 4245016c8e..fccb9765ae 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -529,28 +529,28 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index c5ffa9f2cc..bd35c930b0 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -34,10 +34,10 @@ SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-1 day +02:03:00
(1 row)
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
- Ten days twelve hours
------------------------
- 10 days 12:00:00
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
+ Ten days
+----------
+ 10 days
(1 row)
SELECT INTERVAL '1.5 months' AS "One month 15 days";
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 11c1929bef..bd141e7b52 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -11,7 +11,7 @@ SELECT INTERVAL '+02:00' AS "Two hours";
SELECT INTERVAL '-08:00' AS "Eight hours";
SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
On Mon, Apr 05, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
On Mon, Apr 5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
I see that. What is not clear here is how far we flow down. I was
looking at adding documentation or regression tests for that, but was
unsure. I adjusted the docs slightly in the attached patch.
I should have adjusted the quote to include context:
| In the verbose input format, and in SOME FIELDS of the more compact input formats, field values can have fractional parts[...]
I don't know what "some fields" means - more clarity here would help indicate
the intended behavior.
The interaction of months/days/seconds is so imprecise that passing it
further down doesn't make much sense, and suggests a precision that
doesn't exist, but if people prefer that we can do it.
I agree on its face that "months" is imprecise (28, 29, 30, 31 days),
especially fractional months, and same for "years" (leap years), and hours per
day (DST), but even minutes ("leap seconds"). But the documentation seems to
be clear about the behavior:
| .. using the conversion factors 1 month = 30 days and 1 day = 24 hours
I think the most obvious/consistent change is for years and greater to "cascade
down" to seconds, and not just months.
--
Justin
On Mon, Apr 5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
On Mon, Apr 05, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
On Mon, Apr 5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
I see that. What is not clear here is how far we flow down. I was
looking at adding documentation or regression tests for that, but was
unsure. I adjusted the docs slightly in the attached patch.
I should have adjusted the quote to include context:
| In the verbose input format, and in SOME FIELDS of the more compact input formats, field values can have fractional parts[...]
I don't know what "some fields" means - more clarity here would help indicate
the intended behavior.
I assume it is comparing the verbose format to the ISO 8601 time
intervals format, which I have not looked at. Interestingly, I see this
C comment at the top of DecodeISO8601Interval():
* A couple exceptions from the spec:
* - a week field ('W') may coexist with other units
--> * - allows decimals in fields other than the least significant unit.
I don't actually see anything in our code that doesn't support fractional
values, so maybe the docs are wrong and need to be fixed.
Actually, according to our regression tests, this fails:
SELECT '5.5 seconds 3 milliseconds'::interval;
ERROR: invalid input syntax for type interval: "5.5 seconds 3 milliseconds"
but that is the verbose format, I think.
The interaction of months/days/seconds is so imprecise that passing it
further down doesn't make much sense, and suggests a precision that
doesn't exist, but if people prefer that we can do it.
I agree on its face that "months" is imprecise (28, 29, 30, 31 days),
especially fractional months, and same for "years" (leap years), and hours per
day (DST), but even minutes ("leap seconds"). But the documentation seems to
be clear about the behavior:
| .. using the conversion factors 1 month = 30 days and 1 day = 24 hours
I think the most obvious/consistent change is for years and greater to "cascade
down" to seconds, and not just months.
Wow, well, that is _an_ option. Would people like that? It is certainly
easier to explain.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On 05-Apr-2021, at 11:37, Bruce Momjian <bruce@momjian.us> wrote:
Wow, well, that is _an_ option. Would people like that? It is certainly
easier to explain.
It seems to me that this whole business is an irrevocable mess. The original design could have brought three overload-distinguishable types, "interval month", "interval day", and "interval second"—each represented internally as a scalar. There could have been built-ins to convert between them using conventionally specified rules. Then interval arithmetic would have been clear. For example, an attempt to assign the difference between two timestamps to anything but "interval second" would cause an error (as it does in Oracle Database, even though there are only two interval kinds). But we can only deal with what we have and accept the fact that the doc will inevitably be tortuous.
Given this, I agree that fractional years should simply convert to fractional months (to be then added to verbatim-given fractional months) just before representing the months as the trunc() of the value and cascading the remainder down to days. Units like century would fall out naturally in the same way.
ABOUT LEAP SECONDS
Look at this (from Feb 2005):
«
PostgreSQL does not support leap seconds
/messages/by-id/1162319515.20050202141132@mail.ru
»
I don't know if the title reports a state of affairs in the hope that this be changed to bring such support—or whether it simply states what obtains and always will. Anyway, a simple test (below) shows that PG Version 13.2 doesn't honor leap seconds.
DETAIL
First, it helps me to demonstrate, using leap years, that this is a base phenomenon of the proleptic Gregorian calendar that PG uses—and has nothing to do with time zones. (If it did, then the leap year notion could be time zone dependent.) Do this:
select
'2020-02-29'::date as "date",
'2020-02-29 23:59:59.99999'::timestamp as "plain timestamp";
This is the result:
date | plain timestamp
------------+---------------------------
2020-02-29 | 2020-02-29 23:59:59.99999
Changing the year to 2021 brings the 22008 error "date/time field value out of range". (Of course, you have to split the test into two pieces to be sure that you get the same error with both data types.)
This suggests a test that uses '23:59:60.000000' for the time. However, try this first:
select
'23:59:60.000000'::time as "time",
'2021-04-05 23:59:60.000000'::timestamp as "plain timestamp";
time | plain timestamp
----------+---------------------
24:00:00 | 2021-04-06 00:00:00
This is annoying. It reflects what seems to me to be an unfortunate design choice. Anyway, this behavior will never change. But it means that a precise discussion needs more words than had one minute been taken as a closed-open interval—[0,60) seconds—(with 59.99999 legal and 60.000000 illegal). It's too boring to type all those words here. Just do this:
select '2021-04-05 23:59:60.5'::timestamp as "plain timestamp";
This is the result:
plain timestamp
-----------------------
2021-04-06 00:00:00.5
Of course, there was no leap second (on the planet—never mind databases) at this moment. The most recent leap second was 2016-12-31 at 23:59:60 (UTC), meaning that '60.000000' through '60.999999' were all meaningful times on 31-Dec that year. So try this:
select '2016-12-31 23:59:60.5'::timestamp as "should be leap second";
This is the result:
should be leap second
-----------------------
2017-01-01 00:00:00.5
This tells me that the subject line of the email from 2005 remains correct: PostgreSQL does not support leap seconds. Given this, we can safely say that one minute is exactly 60 seconds (and that one hour is exactly 60 minutes) and never mention leap seconds ever again. I assume that it's this that must have informed the decision to represent an interval value as the three fields months, days, and seconds.
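A quick sanity check of that simplification (I'd expect both comparisons to come back true, since intervals normalize minutes and hours into the seconds field):
select interval '1 minute' = interval '60 seconds' as "min = 60 sec",
       interval '1 hour'   = interval '60 minutes' as "hr = 60 min";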
On Mon, Apr 5, 2021 at 01:06:36PM -0700, Bryn Llewellyn wrote:
On 05-Apr-2021, at 11:37, Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Apr 5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
It seems to me that this whole business is an irrevocable mess. The
original design could have brought three overload-distinguishable
types, "interval month", "interval day", and "interval second"—each
represented internally as a scalar. There could have been built-ins
to convert between them using conventionally specified rules. Then
interval arithmetic would have been clear. For example, an attempt to
assign the difference between two timestamps to anything but "interval
second" would cause an error (as it does in Oracle database, even
though there are only two interval kinds). But we can only deal
with what we have and accept the fact that the doc will inevitably be
tortuous.
The problem with making three data types is that someone is going to
want to use a mixture of those, so I am not sure it is a win.
Given this, I agree that fractional years should simply convert to
fractional months (to be then added to verbatim-given fractional
months) just before representing the months as the trunc() of the
value and cascading the remainder down to days. Units like century
would fall out naturally in the same way.
I am confused --- are you saying we should do the interval addition,
then truncate, because we don't do that now, and it would be hard to do.
ABOUT LEAP SECONDS
Look at this (from Feb 2005):
« PostgreSQL does not support leap seconds
/messages/by-id/1162319515.20050202141132@mail.ru »
I don't know if the title reports a state of affairs in the hope that
this be changed to bring such support—or whether it simply states
what obtains and always will. Anyway, a simple test (below) shows that
PG Version 13.2 doesn't honor leap seconds.
Postgres is documented as not supporting leap seconds:
https://www.postgresql.org/docs/13/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
timezone
The time zone offset from UTC, measured in seconds. Positive values
correspond to time zones east of UTC, negative values to zones west of
UTC. (Technically, PostgreSQL does not use UTC because leap seconds are
not handled.)
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On 05-Apr-2021, at 13:35, Bruce Momjian <bruce@momjian.us> wrote:
The problem with making three data types is that someone is going to
want to use a mixture of those, so I am not sure it is a win.
I am confused --- are you saying we should do the interval addition,
then truncate, because we don't do that now, and it would be hard to do.
Postgres is documented as not supporting leap seconds:
https://www.postgresql.org/docs/13/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
Thanks for the “leap seconds not supported” link. Google’s search within site refused to find that for me. (Talk about well hidden).
About “ three data [interval] types” it’s too late anyway. So I’ll say no more.
Re “are you saying we should do the interval addition, then truncate, because we don't do that now, and it would be hard to do.” I wasn’t thinking of interval addition at all. Simply how the three values that make up the internal representation are computed from a specified interval value. Like the PL/pgSQL simulation I showed you in an earlier reply. I can't find that in the archive now. So here it is again. Sorry for the repetition.
p.yy, p.mo, p.dd, p.hh, p.mi, and p.ss are the inputs
m, d, and s are the internal representation
m1 int not null := trunc(p.mo);
m_remainder numeric not null := p.mo - m1::numeric;
m int not null := trunc(p.yy*12) + m1;
d_real numeric not null := p.dd + m_remainder*30.0;
d int not null := floor(d_real);
d_remainder numeric not null := d_real - d::numeric;
s numeric not null := d_remainder*24.0*60.0*60.0 +
p.hh*60.0*60.0 +
p.mi*60.0 +
p.ss;
I have a harness to supply years, months, days, hours, minutes, and seconds values (just as the literal does them) and to get them back (as "extract" gets them) using the actual implementation and my simulation. The two approaches have never disagreed over a wide range of inputs.
The algorithm that my code shows (especially with both trunc() and floor() in play) is too hard to describe in words.
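To save anyone retyping it, here is the same simulation assembled into one runnable function. This is only a sketch; the type and function names are my own invention.

create type interval_parts_t as (mm int, dd int, ss numeric);

create function interval_parts(
  yy numeric default 0, mo numeric default 0, dd numeric default 0,
  hh numeric default 0, mi numeric default 0, ss numeric default 0)
  returns interval_parts_t
  language plpgsql
as $body$
declare
  -- Whole months from the years and months inputs; keep the fraction.
  m1          int     not null := trunc(mo);
  m_remainder numeric not null := mo - m1::numeric;
  m           int     not null := trunc(yy*12) + m1;
  -- Fractional months cascade to days at 1 month = 30 days.
  d_real      numeric not null := dd + m_remainder*30.0;
  d           int     not null := floor(d_real);
  d_remainder numeric not null := d_real - d::numeric;
  -- Fractional days cascade to seconds at 1 day = 24 hours.
  s           numeric not null := d_remainder*24.0*60.0*60.0 +
                                  hh*60.0*60.0 + mi*60.0 + ss;
begin
  return (m, d, s)::interval_parts_t;
end;
$body$;

-- For example: select * from interval_parts(mo => 29.4); -- expect (29, 12, 0)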
On Mon, Apr 5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
On Mon, Apr 5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
Well, bug or not, we are not going to change back branches for this, and
if you want a larger discussion, it will have to wait for PG 15.
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
« …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
I see that. What is not clear here is how far we flow down. I was
looking at adding documentation or regression tests for that, but was
unsure. I adjusted the docs slightly in the attached patch.
Here is an updated patch, which will be for PG 15. It updates the
documentation to state:
The fractional parts are used to compute appropriate values for the next
lower-order internal fields (months, days, seconds).
It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachment: interval.diff (text/x-diff; charset=us-ascii)
From 0b5bf02e5baaa29e74fac9ac3823d593b6c5e3f2 Mon Sep 17 00:00:00 2001
From: Bruce Momjian <bruce@momjian.us>
Date: Thu, 8 Apr 2021 13:07:16 -0400
Subject: [PATCH] interval squash commit
---
doc/src/sgml/datatype.sgml | 16 +++++++---------
src/backend/utils/adt/datetime.c | 13 ++++++-------
src/interfaces/ecpg/pgtypeslib/interval.c | 12 ++++++------
src/test/regress/expected/interval.out | 8 ++++----
src/test/regress/sql/interval.sql | 2 +-
5 files changed, 24 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 7c341c8e3f..9454107e75 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2770,15 +2770,13 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. The fractional
+ parts are used to compute appropriate values for the next lower-order
+ internal fields (months, days, seconds). For example, <literal>'1.5
+ months'</literal> becomes <literal>1 month 15 days</literal>.
+ The fractional conversion factors used are 1 month = 30 days and 1 day
+ = 24 hours. Only seconds will ever be shown as fractional on output.
</para>
<para>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
extra_days = (int) frac;
tm->tm_mday += extra_days;
frac -= extra_days;
- AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
}
/* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3564,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3600,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 4245016c8e..fccb9765ae 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -529,28 +529,28 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index 0191949137..3f948f56c0 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -34,10 +34,10 @@ SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-1 day +02:03:00
(1 row)
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
- Ten days twelve hours
------------------------
- 10 days 12:00:00
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
+ Ten days
+----------
+ 10 days
(1 row)
SELECT INTERVAL '1.5 months' AS "One month 15 days";
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 6d532398bd..58bc06178c 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -11,7 +11,7 @@ SELECT INTERVAL '+02:00' AS "Two hours";
SELECT INTERVAL '-08:00' AS "Eight hours";
SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
--
2.20.1
On 08-Apr-2021, at 10:24, Bruce Momjian <bruce@momjian.us> wrote:
Here is an updated patch, which will be for PG 15. It updates the
documentation to state:
The fractional parts are used to compute appropriate values for the next
lower-order internal fields (months, days, seconds).
It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.
Thank you Bruce. I look forward to documenting this new algorithm for YugabyteDB. The algorithm implements the transformation from this:
[
yy_in numeric,
mo_in numeric,
dd_in numeric,
hh_in numeric,
mi_in numeric,
ss_in numeric
]
to this:
[
mo_internal_representation int,
dd_internal_representation int,
ss_internal_representation numeric(1000,6)
]
I am convinced that a prose account of the algorithm, by itself, is not the best way to tell the reader the rules that the algorithm implements. Rather, pseudocode is needed. I mentioned before that better still is actual executable PL/pgSQL code. (I can expect readers to be fluent in PL/pgSQL.) Given this executable simulation, an informal prose sketch of what it does will definitely add value.
May I ask you to fill in the body of this stub by translating the C that you have in hand?
create type internal_representation_t as(
mo_internal_representation int,
dd_internal_representation int,
ss_internal_representation numeric(1000,6));
create function internal_representation(
yy_in numeric default 0,
mo_in numeric default 0,
dd_in numeric default 0,
hh_in numeric default 0,
mi_in numeric default 0,
ss_in numeric default 0)
returns internal_representation_t
language plpgsql
as $body$
declare
mo_internal_representation int not null := 0;
dd_internal_representation int not null := 0;
ss_internal_representation numeric not null := 0.0;
ok constant boolean :=
(yy_in is not null) and
(mo_in is not null) and
(dd_in is not null) and
(hh_in is not null) and
(mi_in is not null) and
(ss_in is not null);
begin
assert ok, 'No actual argument, when provided, may be null';
-- The algorithm.
return (mo_internal_representation, dd_internal_representation, ss_internal_representation)::internal_representation_t;
end;
$body$;
By the way, I believe that a user might well decide always to supply all the fields in a "from text to interval" typecast, except for the seconds, as integral values. This, after all, is what the "make_interval()" function enforces. But, because the typecast approach allows non-integral values, the reference documentation must explain the rules unambiguously so that the reader can predict the outcome of any ad hoc test that they might try.
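For instance (a quick check; the second statement should fail because make_interval() declares every parameter except "secs" as integer, though the exact error wording may vary by version):
select make_interval(years => 1, months => 2, days => 3, secs => 4.5);
-- 1 year 2 mons 3 days 00:00:04.5
select make_interval(years => 1.5);
-- ERROR: function make_interval(years => numeric) does not exist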
It's a huge pity that the three values of the internal representation cannot be observed directly using SQL because each behaves with different semantics when an interval value is added to a timestamptz value. However, as a second best (and knowing the algorithm above), a user can create interval values where only one of the three fields is populated and test their understanding of the semantic rules that way.
On Thu, Apr 8, 2021 at 10:24 AM Bruce Momjian <bruce@momjian.us> wrote:
Here is an updated patch, which will be for PG 15. It updates the
documentation to state:
The fractional parts are used to compute appropriate values for the next
lower-order internal fields (months, days, seconds).
It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.
+1 to this patch.
On Sun, Apr 11, 2021 at 12:57 PM Zhihong Yu <zyu@yugabyte.com> wrote:
+1 to this patch.
Bryn reminded me, off list, about the flowing down from fractional
day after the patch.
Before Bruce confirms the removal of the flowing down from fractional day,
I withhold my previous +1.
Bryn would respond with more details.
Cheers
On Sun, Apr 11, 2021 at 4:33 PM Zhihong Yu <zyu@yugabyte.com> wrote:
Before Bruce confirms the removal of the flowing down from fractional day,
I withhold my previous +1.
Among previous examples given by Bryn, the following produces the correct
result based on Bruce's patch.
# select interval '-1.7 years 29.4 months';
interval
----------------
9 mons 12 days
(1 row)
Cheers
On Sun, Apr 11, 2021 at 07:33:34PM -0700, Zhihong Yu wrote:
Among previous examples given by Bryn, the following produces the correct result
based on Bruce's patch.
# select interval '-1.7 years 29.4 months';
    interval
----------------
 9 mons 12 days
Yes, that change is caused by the rounding fixes, and not by the unit
pushdown adjustments.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
bruce@momjian.us wrote:
zyu@yugabyte.com wrote:
Among previous examples given by Bryn, the following produces correct result based on Bruce's patch.
# select interval '-1.7 years 29.4 months';
interval
----------------
9 mons 12 days
Yes, that change is caused by the rounding fixes, and not by the unit pushdown adjustments.
I showed you all this example a long time ago:
select (
'
3.853467 years
'::interval
)::text as i;
This behavior is the same in the env. of Bruce’s patch as in unpatched PG 13.2. This is the result.
3 years 10 mons
Notice that "3.853467 years" is "3 years" plus "10.241604 months". This explains the "10 mons" in the result. But the 0.241604 months remainder did not spill down into days.
Can anybody defend this quirk? An extension of this example with a real number of months in the user input is correspondingly yet more quirky. The rules can be written down. But they’re too tortuous to allow an ordinary mortal confidently to design code that relies on them.
(I was unable to find any rule statement that lets the user predict this in the doc. But maybe that’s because of my feeble searching skills.)
If there is no defense (and I cannot imagine one) might Bruce’s patch normalize this too to follow this rule (sketched in code below)?
— convert 'y years m months' to the real number y*12 + m.
— record trunc(y*12 + m) in the "months" field of the internal representation
— flow the remainder down to days (but no further)
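Here is that proposed rule as a minimal sketch (the function name is mine, and negative inputs would need a decision about which way the remainder rounds):

create function months_field_and_days(
  yy numeric, mo numeric,
  out months int, out days int)
  language plpgsql
as $body$
declare
  -- Convert 'y years m months' to a single real number of months.
  real_months constant numeric := yy*12.0 + mo;
begin
  months := trunc(real_months);
  -- Flow the remainder down to days (but no further).
  days := floor((real_months - trunc(real_months))*30.0);
end;
$body$;

-- select * from months_field_and_days(3.853467, 0);
-- expect (46, 7), i.e. 3 years 10 mons 7 days, matching the patched output.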
After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
The docs now say:
Field values can have fractional parts; for example, <literal>'1.5
weeks'</literal> or <literal>'01:02:03.45'</literal>. The fractional
--> parts are used to compute appropriate values for the next lower-order
internal fields (months, days, seconds).
meaning fractional years flows to the next lower internal unit, months,
and no further. Fractional months would flow to days. The idea of not
flowing past the next lower-order internal field is that the
approximations between units are not precise enough to flow accurately.
With my patch, the output is now:
SELECT INTERVAL '3 years 10.241604 months';
interval
------------------------
3 years 10 mons 7 days
It used to flow to seconds.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Bruce Momjian <bruce@momjian.us> writes:
On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
The docs now say:
Field values can have fractional parts; for example, <literal>'1.5
weeks'</literal> or <literal>'01:02:03.45'</literal>. The fractional
--> parts are used to compute appropriate values for the next lower-order
internal fields (months, days, seconds).
meaning fractional years flows to the next lower internal unit, months,
and no further. Fractional months would flow to days. The idea of not
flowing past the next lower-order internal field is that the
approximations between units are not precise enough to flow accurately.
Um, what's the argument for down-converting AT ALL? The problem is
precisely that any such conversion is mostly fictional.
With my patch, the output is now:
SELECT INTERVAL '3 years 10.241604 months';
interval
------------------------
3 years 10 mons 7 days
It used to flow to seconds.
Yeah, that's better than before, but I don't see any principled argument
for it not to be "3 years 10 months", full stop.
regards, tom lane
On Mon, Apr 12, 2021 at 07:38:21PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
The docs now say:
Field values can have fractional parts; for example, <literal>'1.5
weeks'</literal> or <literal>'01:02:03.45'</literal>. The fractional
--> parts are used to compute appropriate values for the next lower-order
internal fields (months, days, seconds).
meaning fractional years flows to the next lower internal unit, months,
and no further. Fractional months would flow to days. The idea of not
flowing past the next lower-order internal field is that the
approximations between units are not precise enough to flow accurately.
Um, what's the argument for down-converting AT ALL? The problem is
precisely that any such conversion is mostly fictional.
True.
With my patch, the output is now:
SELECT INTERVAL '3 years 10.241604 months';
interval
------------------------
3 years 10 mons 7 days
It used to flow to seconds.
Yeah, that's better than before, but I don't see any principled argument
for it not to be "3 years 10 months", full stop.
Well, the case was:
SELECT INTERVAL '0.1 months';
interval
----------
3 days
SELECT INTERVAL '0.1 months' + interval '0.9 months';
?column?
----------
30 days
If you always truncate, you basically lose the ability to specify
fractional internal units, which I think is useful. I would say if you
use fractional units of one of the internal units, you basically
know you are asking for an approximation --- that is not true of '3.5
years', for example.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
tgl@sss.pgh.pa.us wrote:
Yeah, that's better than before, but I don't see any principled argument for it not to be "3 years 10 months", full stop.
Tom, I fully support your suggestion to have no flow down at all. Please may this happen! However, the new rule must be described in terms of the three fields of the internal representation: [months, days, seconds]. This representation is already documented.
Don’t forget that '731.42587 hours’ is read back as "731:25:33.132" (or, if you prefer, 731 hours, 25 minutes, and 33.132 seconds if you use "extract" and your own pretty print). But we don’t think of this as “flow down”. Rather, it’s just a conventional representation of the seconds field of the internal representation. I could go on. But you all know what I mean.
By the way, I made a nice little demo (for my doc project). It shows that:
(1) if you pick the right date-time just before a DST change, and do the addition in the right time zone, then adding 24 hours gets a different answer than adding one day.
(2) if you pick the right date-time just before 29-Feb in a leap year, then adding 30 days gets a different answer than adding one month.
You all know why. And though the doc could explain and illustrate this better, it does tell you to expect this. It also says that the difference in semantics that these examples show is the reason for the three-field internal representation.
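A minimal version of that demo, assuming the America/Los_Angeles time zone (where clocks sprang forward at 02:00 on 2021-03-14); the trailing comments show the results I'd expect:
set timezone = 'America/Los_Angeles';
select timestamptz '2021-03-13 20:00' + interval '1 day';    -- 2021-03-14 20:00:00-07
select timestamptz '2021-03-13 20:00' + interval '24 hours'; -- 2021-03-14 21:00:00-07
select timestamp '2020-02-28' + interval '1 month';          -- 2020-03-28 00:00:00
select timestamp '2020-02-28' + interval '30 days';          -- 2020-03-29 00:00:00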
It seems to me that both the age-old behavior that vanilla 13.2 exhibits, and the behavior in the regime of Bruce’s patch are like adding 2.2 oranges to 1.3 oranges and getting 3 oranges and 21 apples (because 1 orange is conventionally the same as 42 apples). Bruce made a step in the right direction by stopping oranges convert all the way down to bananas. But it would be so much better to get rid of this false equivalence business altogether.
On 12-Apr-2021, at 17:00, Bruce Momjian <bruce@momjian.us> wrote:
Well, the case was:
SELECT INTERVAL '0.1 months';
 interval
----------
 3 days
SELECT INTERVAL '0.1 months' + interval '0.9 months';
 ?column?
----------
 30 days
If you always truncate, you basically lose the ability to specify
fractional internal units, which I think is useful.
I’d argue that the fact that this:
('0.3 months'::interval) + ('0.7 months'::interval)
is reported as '30 days' and not '1 month' is yet another bug—precisely because of what I said in my previous email (sorry that I forked the thread) where I referred to the fact that, in the right test, adding 1 month gets a different answer than adding 30 days. Yet another convincing reason to get rid of this flow down business altogether.
If some application wants to model flow-down, then it can do so with trivial programming and full control over its own definition of the rules.
On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
I’d argue that the fact that this:
('0.3 months'::interval) + ('0.7 months'::interval)
Is reported as '30 days' and not '1 month' is yet another
bug—precisely because of what I said in my previous email (sorry
that I forked the thread) where I referred to the fact that, in the
right test, adding 1 month gets a different answer than adding 30
days.
Flowing _up_ is what these functions do:
\df *justify*
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
pg_catalog | justify_days | interval | interval | func
pg_catalog | justify_hours | interval | interval | func
pg_catalog | justify_interval | interval | interval | func
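For example (these match the documented behavior of the three functions):
select justify_hours(interval '27 hours');         -- 1 day 03:00:00
select justify_days(interval '35 days');           -- 1 mon 5 days
select justify_interval(interval '1 mon -1 hour'); -- 29 days 23:00:00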
Yet another convincing reason to get rid of this flow down
business altogether.
We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.
If some application wants to model flow-down, then it can do so with
trivial programming and full control over its own definition of the
rules.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Mon, Apr 12, 2021 at 4:22 PM Bruce Momjian <bruce@momjian.us> wrote:
With my patch, the output is now:
SELECT INTERVAL '3 years 10.241604 months';
 interval
------------------------
 3 years 10 mons 7 days
It used to flow to seconds.
Based on the results of more samples, I restore +1 to Bruce's latest patch.
Cheers
On 12-Apr-2021, at 17:25, Bruce Momjian <bruce@momjian.us> wrote:
We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.
“Yes please!” re Bruce’s “We can certainly get rid of all downflow, which in the current patch is only when fractional internal units are specified.”
Notice that a user who creates interval values explicitly only by using “make_interval()” will see no behavior change.
There’s another case of up-flow. When you subtract one timestamp value from another, and when they’re far enough apart, then the (internal representation of the) resulting interval value has both a seconds component and a days component. (But never, in my tests, a months component.) I assume that the implementation first gets the difference between the two timestamp values in seconds using (the moral equivalent of) “extract epoch”. And then, if this is greater than 24*60*60, it implements up-flow using the “rule-of-24”—never mind that this means that if you add the answer back to the timestamp value that you subtracted, then you might not get the timestamp value from which you subtracted. (This happens around a DST change and has been discussed earlier in the thread.)
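For example (plain timestamps, so no DST complication; 2021-02-01 to 2021-04-05 is 63 whole days, and the result carries both a days and a seconds component but no months component):
select timestamp '2021-04-05 12:00' - timestamp '2021-02-01 00:00';
-- 63 days 12:00:00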
The purpose of these three “justify” functions is dubious. I think that it’s best to think of the 3-component interval vector like an [x, y, z] vector in 3d geometric space, where the three coordinates are not mutually convertible because each has unique semantics. This point has been rehearsed many times in this long thread.
Having said this, it isn’t hard to understand the rules that the functions implement. And, most importantly, their use is voluntary. They are, though, no more than shipped and documented wrappers for a few special cases. A user could so easily write their own function like this:
1. Compute the values of the three components of the internal representation of the passed-in interval value using the “extract” feature and some simple arithmetic.
2. Derive the [months, days, seconds] values of a new representation using whatever rules you prefer.
3. Use these new values to create the return interval value.
For example, I might follow a discipline to use interval values that have only one of the three components of the internal representation non-zero. It’s easy to use the “domain” feature for this. (I can use these in any context where I can use the shipped interval.) I could then write a function to convert a “pure seconds” value of my_interval to a “pure years” value. And I could implement my own rules:
— Makes sense only for a large number of seconds that comes out to at least five years (else assert failure).
— Converts seconds to years using the rule that 1 year is, on average, 365.25*24*60*60 seconds, and then truncates it.
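A minimal sketch of such a function (the name and the error text are mine; it assumes the 365.25-day year just stated):
create function seconds_to_years(i interval)
  returns interval
  language plpgsql
as $$
declare
  -- total seconds, converted to years using 365.25*24*60*60 seconds/year
  y constant double precision := extract(epoch from i) / (365.25*24*60*60);
begin
  assert y >= 5, 'makes sense only for at least five years';
  return make_interval(years => trunc(y)::int);
end;
$$;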
There’s no shipped function that does this, and this makes me suspect that I’d prefer to roll my own for any serious purpose.
The existence of the three “justify” functions is, therefore, harmless.
On Fri, May 7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
There’s no shipped function that does this, and this makes me suspect that
I’d prefer to roll my own for any serious purpose.
The existence of the three “justify” functions is, therefore, harmless.
Bruce / Tom:
Can we revisit this topic ?
I thought we agreed that the attached patch will be applied to PG 15.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 0e8ef958a9..0830128013 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2770,15 +2770,13 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. The fractional
+ parts are used to compute appropriate values for the next lower-order
+ internal fields (months, days, seconds). For example, <literal>'1.5
+ months'</literal> becomes <literal>1 month 15 days</literal>.
+ The fractional conversion factors used are 1 month = 30 days and 1 day
+ = 24 hours. Only seconds will ever be shown as fractional on output.
</para>
<para>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..fb9a753223 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
extra_days = (int) frac;
tm->tm_mday += extra_days;
frac -= extra_days;
- AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
}
/* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3564,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3600,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 02b3c47223..0b2fccbbe6 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -529,28 +529,28 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index e4b1246f45..e8bc7a2337 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -34,10 +34,10 @@ SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-1 days +02:03:00
(1 row)
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
- Ten days twelve hours
------------------------
- 10 days 12:00:00
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
+ Ten days
+----------
+ 10 days
(1 row)
SELECT INTERVAL '1.5 months' AS "One month 15 days";
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 6d532398bd..58bc06178c 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -11,7 +11,7 @@ SELECT INTERVAL '+02:00' AS "Two hours";
SELECT INTERVAL '-08:00' AS "Eight hours";
SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
-SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
+SELECT INTERVAL '1.5 weeks' AS "Ten days";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
There’s no shipped function that does this, and this makes me suspect that
I’d prefer to roll my own for any serious purpose.
The existence of the three “justify” functions is, therefore, harmless.
Bruce / Tom:
Can we revisit this topic ?
Cheers
On Fri, May 7, 2021 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
I thought we agreed that the attached patch will be applied to PG 15.
Good to know.
Hopefully it lands soon.
On Fri, May 7, 2021 at 07:39:31PM -0700, Zhihong Yu wrote:
Good to know.
Hopefully it lands soon.
It will be applied in June/July, but will not appear in a release until
Sept/Oct, 2022. Sorry.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
Now that PG 15 is open for commit, do you think the patch can land ?
Adding it to the commitfest patch tracker is a good way to ensure it's not
forgotten about:
https://commitfest.postgresql.org/33/
--
Daniel Gustafsson https://vmware.com/
On Fri, May 7, 2021 at 7:42 PM Bruce Momjian <bruce@momjian.us> wrote:
It will be applied in June/July, but will not appear in a release until
Sept/Oct, 2022. Sorry.
Bruce:
Now that PG 15 is open for commit, do you think the patch can land ?
Cheers
On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
Now that PG 15 is open for commit, do you think the patch can land ?
Adding it to the commitfest patch tracker is a good way to ensure it's not
forgotten about:
OK, I have been keeping it in my git tree since I wrote it and will
apply it in the next few days.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, Jun 30, 2021 at 9:35 AM Bruce Momjian <bruce@momjian.us> wrote:
OK, I have been keeping it in my git tree since I wrote it and will
apply it in the next few days.
Thanks, Bruce.
Hopefully you can get to this soon.
On Thu, Jul 8, 2021 at 10:22 AM Zhihong Yu <zyu@yugabyte.com> wrote:
Thanks, Bruce.
Hopefully you can get to this soon.
Bruce:
Please see if the patch can be integrated now.
Cheers
On Wed, Jul 14, 2021 at 09:03:21AM -0700, Zhihong Yu wrote:
Bruce:
Please see if the patch can be integrated now.
I found a mistake in my most recent patch. For example, in master we
see this output:
SELECT INTERVAL '1.8594 months';
interval
--------------------------
1 mon 25 days 18:46:04.8
Obviously this should return '1 mon 26 days', but with my most recent
patch, it returned '1 mon 25 days'. Turns out I had not properly used
rint() in AdjustFractDays, and in fact the function is no longer needed
because it is just a multiplication and an rint().
Updated patch attached.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
On Mon, Jul 19, 2021 at 9:14 PM Bruce Momjian <bruce@momjian.us> wrote:
Updated patch attached.
Hi,
Patch looks good.
Maybe add the statement above as a test case:
SELECT INTERVAL '1.8594 months'
Cheers
On Tue, Jul 20, 2021 at 02:33:07PM -0700, Zhihong Yu wrote:
On Mon, Jul 19, 2021 at 9:14 PM Bruce Momjian <bruce@momjian.us> wrote:
Obviously this should return '1 mon 26 days', but with my most recent
patch, it returned '1 mon 25 days'. Turns out I had not properly used
rint() in AdjustFractDays, and in fact the function is no longer needed
because it is just a multiplication and an rint().
Patch looks good.
Maybe add the statement above as a test case:
SELECT INTERVAL '1.8594 months'
Good idea --- updated patch attached.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
On Tue, Jul 20, 2021 at 3:53 PM Bruce Momjian <bruce@momjian.us> wrote:
Good idea --- updated patch attached.
Hi,
With your patch, the following example (courtesy Bryn) still shows a fraction:
# select (interval '1 month')*1.123;
?column?
-----------------------
1 mon 3 days 16:33:36
(1 row)
Do you think the output can be improved (by getting rid of the fraction)?
Thanks
On Tue, Jul 20, 2021 at 05:13:37PM -0700, Zhihong Yu wrote:
On Tue, Jul 20, 2021 at 3:53 PM Bruce Momjian <bruce@momjian.us> wrote:
With your patch, the following example (courtesy Bryn) still shows a fraction:
# select (interval '1 month')*1.123;
       ?column?
-----------------------
 1 mon 3 days 16:33:36
(1 row)
Do you think the output can be improved (by getting rid of the fraction)?
Well, I focused on how fractional units were processed inside of
interval values. I never considered how multiplication should be
handled. I have not really thought about how to handle that, but this
example now gives me concern:
SELECT INTERVAL '1.06 months 1 hour';
interval
-----------------------
1 mon 2 days 01:00:00
Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
spilling to hours/minutes/seconds, even though hours is already
specified. I don't see a better way to handle this than the current
code already does, but it is something odd.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, 21 Jul 2021 at 03:48, Bruce Momjian <bruce@momjian.us> wrote:
this example now gives me concern:
SELECT INTERVAL '1.06 months 1 hour';
interval
-----------------------
1 mon 2 days 01:00:00
Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
spilling to hours/minutes/seconds, even though hours is already
specified. I don't see a better way to handle this than the current
code already does, but it is something odd.
Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.
For example, with this patch:
SELECT '0.5 weeks'::interval;
interval
----------
4 days
which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.
It's true that that leads to odd-looking results when the field value
has lots of fractional digits, but that was at least explainable, and
followed the documentation.
Looking for a general principle to follow, how about this -- the
result of specifying a fractional value should be the same as
multiplying an interval of 1 unit by that value. In other words,
'1.8594 months'::interval should be the same as '1 month'::interval *
1.8594. (Actually, it probably can't easily be made exactly the same
in all cases, due to differences in the floating point computations in
the two cases, and rounding errors, but it's hopefully not far off,
unlike the results obtained by not spilling down to lower units on
input.)
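A quick way to eyeball that principle (near-equality is the expectation, not exactness, per the caveat above):
SELECT '1.8594 months'::interval AS parsed,
       '1 month'::interval * 1.8594 AS multiplied;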
The cases that are broken in master, in my opinion, are the larger
units (year and above), which don't propagate down in the same way as
fractional months and below. So, for example, '0.7 years' should be
8.4 months (with the conversion factor of 1 year = 12 months), giving
'8 months 12 days', which is what '1 year'::interval * 0.7 produces.
Sure, there are arguably more accurate ways of computing that.
However, that's the result obtained using the documented conversion
factors, so it's justifiable in those terms.
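On unpatched master I'd expect the two routes to disagree like this:
SELECT '0.7 years'::interval;      -- 8 mons   (the 0.4 month fraction is silently dropped)
SELECT '1 year'::interval * 0.7;   -- 8 mons 12 days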
It's worth noting another case that is broken in master:
SELECT '1.7 decades'::interval;
interval
------------------
16 years 11 mons
which is surely not what anyone would expect. The current patch fixes
this, but it would also be fixed by handling the fractional digits for
these units in the same way as for smaller units. There was an earlier
patch doing that, I think, though I didn't test it.
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.
For example, with this patch:
SELECT '0.5 weeks'::interval;
interval
----------
4 days
which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.
Yeah, that is clearly a significant dis-improvement.
In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.
regards, tom lane
On 21-Jul-2021, at 02:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.
For example, with this patch:
SELECT '0.5 weeks'::interval;
 interval
----------
 4 days
which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.
Yeah, that is clearly a significant dis-improvement.
In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.
It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
The consensus was that the outcome that I flagged with “wrong” does indeed have that status. After all, it’s hard to see how anybody could intend this rule (that anyway holds in only some cases):
-a + b <> b - a
It seems odd that there’s been no recent reference to my testcase and how it behaves in the environment of Bruce’s patch.
I don’t recall the history of the thread. But Bruce took on the task of fixing this narrow issue. Anyway, somehow, the whole question of “spill down” came up for discussion. The rules aren’t documented and I’ve been unable to find any reference even to the phenomenon. I have managed to implement a model, in PL/pgSQL, that gets the same results as the native implementation in every one of many tests that I’ve done. I appreciate that this doesn’t prove that my model is correct. But it would seem that it must be on the right track. The rules that my PL/pgSQL uses are self-evidently whimsical—but they were needed precisely to get the same outcomes as the native implementation. There was some discussion of all this somewhere in this thread.
If memory serves, it was Tom who suggested changing the spill-down rules. This was possibly meant entirely rhetorically. But it seems that Bruce did set about implementing a change here. (I was unable to find a clear prose functional spec for the new behavior. Probably I didn’t know where to look.)
There’s no doubt that a change in these rules would change the behavior of extant code. But then, in a purist sense, this is the case with any bug fix.
I’m simply waiting on a final ruling and final outcome.
Meanwhile, I’ve worked out a way to tame all this business (by using domain types and associated functionality) so that application code can deal confidently with only pure months, pure days, and pure seconds interval values (thinking of the internal [mm, dd, ss] representation). The scheme ensures that spill-down never occurs by rounding the years or the months field to integral values. If you want to add a “mixed” interval to a timestamp, then you simply add the different kinds of interval in the one expression. And you use parentheses to assert, visibly, the priority rule that you intend.
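One of the three domains, as a sketch (the name is mine; the checks pin the other two fields of the [mm, dd, ss] representation to zero):
create domain pure_seconds as interval
  check (extract(year from value) = 0
     and extract(month from value) = 0
     and extract(day from value) = 0);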
Because this is ordinary application code, there are no compatibility issues for me. My approach won’t see a change in behavior no matter what is decided about the present patch.
Bryn Llewellyn <bryn@yugabyte.com> writes:
On 21-Jul-2021, at 02:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.
It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
The consensus was that the outcome that I flagged with “wrong” does indeed have that status.
Yeah, I think it's self-evident that your last four cases should
produce the same results. Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent. I think
we should be setting out to fix that outlier without causing
any of the other five results to change.
regards, tom lane
On 21-Jul-2021, at 01:23, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
It's worth noting another case that is broken in master:
SELECT '1.7 decades'::interval;
     interval
------------------
 16 years 11 mons
which is surely not what anyone would expect. The current patch fixes
this, but it would also be fixed by handling the fractional digits for
these units in the same way as for smaller units.
And try these two tests. (I’m using version 13.3 on current macOS.)
select
'1.7 decades'::interval as i1,
('1 decades'::interval)*1.7 as i2,
('10 years'::interval)*1.7 as i3;
i1 | i2 | i3
------------------+----------+----------
16 years 11 mons | 17 years | 17 years
select
'1.7345 decades'::interval as i4,
('1 decades'::interval)*1.7345 as i5,
('10 years'::interval)*1.7345 as i6;
i4 | i5 | i6
-----------------+---------------------------------+---------------------------------
17 years 4 mons | 17 years 4 mons 4 days 04:48:00 | 17 years 4 mons 4 days 04:48:00
Shows only what we know already: mixed interval arithmetic is fishy.
Seems to me that units like “weeks”, “centuries”, “millennia”, and so on are a solution (broken in some cases) looking for a problem. Try this (and variants like I showed above):
select
'1.7345 millennia'::interval as i7,
'1.7345 centuries'::interval as i8,
'1.7345 weeks'::interval as i9;
i7 | i8 | i9
-------------------+------------------+--------------------
1734 years 6 mons | 173 years 5 mons | 12 days 03:23:45.6
On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
Bryn Llewellyn <bryn@yugabyte.com> writes:
It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
The consensus was that the outcome that I flagged with “wrong” does indeed have that status.
Yeah, I think it's self-evident that your last four cases should
produce the same results. Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent. I think
we should be setting out to fix that outlier without causing
any of the other five results to change.
OK, I decided to reverse some of the changes I was proposing once I
started to think about the inaccuracy of not spilling down from 'weeks'
to seconds when hours also appear. The fundamental issue is that the
months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate. This means we are
never going to have consistent spill-down that is useful.
Therefore, I went ahead and accepted that years and larger units spill
only to months, months spill only to days, and weeks and lower spill all
the way down to seconds. I also spelled this out in the docs, and
explained why we have this behavior.
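The three tiers, illustrated (my expectation of the patched behavior, reading the rules above; not verified output from a patched build):
SELECT INTERVAL '0.7 years';   -- 8 mons           (rint(0.7 * 12))
SELECT INTERVAL '0.7 months';  -- 21 days          (rint(0.7 * 30))
SELECT INTERVAL '0.7 weeks';   -- 4 days 21:36:00  (spills all the way to seconds)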
Also, with my patch, the last four queries return the same result
because of the proper rounding also added by the patch, attached.
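For the record, the original test case should then come out like this (again my reading of the patch, not verified output):
select interval '-1.7 years 29.4 months'; -- 9 mons 12 days
select interval '29.4 months -1.7 years'; -- 9 mons 12 days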
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index e016f96fb4..e831242bf6 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2800,15 +2800,18 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
+ because interval internally stores only three integer units (months,
+ days, seconds), fractional units must be spilled to smaller units.
+ For example, because months are approximated to equal 30 days,
+ fractional values of units greater than months is rounded to be the
+ nearest integer number of months. Fractional months are rounded to
+ be the nearest integer number of days, assuming 24 hours per day.
+ Fractional units smaller than months are computed to the nearest
+ second. For example, <literal>'1.5 months'</literal> becomes
+ <literal>1 month 15 days</literal>. Only seconds will ever be shown
+ as fractional on output.
</para>
<para>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..5551102447 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3300,35 +3300,43 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_MONTH:
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /*
+ * The DAYS_PER_MONTH is an estimate so just round
+ * to days, rather than spilling to seconds.
+ */
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
tmask = DTK_M(MONTH);
break;
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,11 +3573,15 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
break;
case 'W':
tm->tm_mday += val * 7;
@@ -3601,7 +3613,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -3615,7 +3627,11 @@ DecodeISO8601Interval(char *str,
if (dterr)
return dterr;
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
if (*str == '\0')
return 0;
if (*str == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 02b3c47223..1a1cb51945 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,11 +155,15 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
break;
case 'W':
tm->tm_mday += val * 7;
@@ -191,7 +195,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -205,7 +209,11 @@ DecodeISO8601Interval(char *str,
if (dterr)
return dterr;
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
if (*str == '\0')
return 0;
if (*str == 'T')
@@ -522,35 +530,43 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_MONTH:
tm->tm_mon += val;
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /*
+ * The DAYS_PER_MONTH is an estimate so just round
+ * to days, rather than spilling to seconds.
+ */
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+ tm->tm_mon++;
+ else
+ tm->tm_mday += rint(fval * DAYS_PER_MONTH);
tmask = DTK_M(MONTH);
break;
case DTK_YEAR:
tm->tm_year += val;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index e4b1246f45..eae18da946 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -46,6 +46,18 @@ SELECT INTERVAL '1.5 months' AS "One month 15 days";
1 mon 15 days
(1 row)
+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";
+ One mon 5 days one hour
+-------------------------
+ 1 mon 5 days 01:00:00
+(1 row)
+
+SELECT INTERVAL '1.99 months' AS "Almost two months";
+ Almost two months
+-------------------
+ 2 mons
+(1 row)
+
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
9 years...
----------------------------------
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 6d532398bd..5a6c8924fe 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -13,6 +13,8 @@ SELECT INTERVAL '-1 +02:03' AS "22 hours ago...";
SELECT INTERVAL '-1 days +02:03' AS "22 hours ago...";
SELECT INTERVAL '1.5 weeks' AS "Ten days twelve hours";
SELECT INTERVAL '1.5 months' AS "One month 15 days";
+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";
+SELECT INTERVAL '1.99 months' AS "Almost two months";
SELECT INTERVAL '10 years -11 month -12 days +13:14' AS "9 years...";
CREATE TABLE INTERVAL_TBL (f1 interval);
On 21-Jul-2021, at 17:07, Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
Bryn Llewellyn <bryn@yugabyte.com> writes:
It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
select interval '-1.7 years'; -- -1 years -8 mons
select interval '29.4 months'; -- 2 years 5 mons 12 days
select interval '-1.7 years 29.4 months'; -- 8 mons 12 days << wrong
select interval '29.4 months -1.7 years'; -- 9 mons 12 days
select interval '-1.7 years' + interval '29.4 months'; -- 9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; -- 9 mons 12 days
The consensus was that the outcome that I flagged with “wrong” does indeed have that status.
Yeah, I think it's self-evident that your last four cases should
produce the same results. Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent. I think
we should be setting out to fix that outlier without causing
any of the other five results to change.
OK, I decided to reverse some of the changes I was proposing once I
started to think about the inaccuracy of not spilling down from 'weeks'
to seconds when hours also appear. The fundamental issue is that the
months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate. This means we are
never going to have consistent spill-down that is useful.
Therefore, I went ahead and accepted that years and larger units spill
only to months, months spill only to days, and weeks and lower spill all
the way down to seconds. I also spelled this out in the docs, and
explained why we have this behavior.
Also, with my patch, the last four queries return the same result
because of the proper rounding also added by the patch, attached.
Your statement
“months-to-days conversion is almost always an approximation, while the days to seconds conversion is almost always accurate.”
is misleading. Any conversion like these (and also the “spill up” conversions that the justify_hours(), justify_days(), and justify_interval() built-in functions bring) are semantically dangerous because of the different rules for adding a pure months, a pure days, or a pure seconds interval to a timestamptz value.
Unless you avoid mixed interval values, it’s very hard (even though it is possible) to predict the outcomes of interval arithmetic. Rather, all you get is emergent behavior that, as far as I can see, cannot be relied upon in deliberately designed application code. Here’s a telling example:
set timezone = 'America/Los_Angeles';
with
c as (
select
'2021-03-13 19:00:00 America/Los_Angeles'::timestamptz as d,
'25 hours'::interval as i)
select
d + i as "d + i",
d + justify_hours(i) as "d + justify_hours(i)"
from c;
This is the result:
d + i | d + justify_hours(i)
------------------------+------------------------
2021-03-14 21:00:00-07 | 2021-03-14 20:00:00-07
The two results are different, even though the native equality test shows that the two different interval values are the same:
with
c as (select '25 hours'::interval as i)
select (i = justify_hours(i))::text
from c;
The result is TRUE.
The only route to sanity is to use only pure interval values (i.e. where only one of the fields of the internal [mm, dd, ss] representation is non-zero).
I mentioned that you can use a set of three domain types to enforce your intended practice here.
In other words, by programming application code defensively, it’s possible to insulate oneself entirely from the emergent behavior of the decades old PG code that implements the unconstrained native interval functionality and that brings what can only be considered to be unpredictable results.
Moreover, this defensive approach insulates you from any changes that Bruce’s patch might make.
On Wed, Jul 21, 2021 at 06:39:26PM -0700, Bryn Llewellyn wrote:
Your statement
“months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate.”
is misleading. Any conversion like these (and also the “spill up” conversions
that the justify_hours(), justify_days(), and justify_interval() built-in
functions bring) are semantically dangerous because of the different rules for
adding a pure months, a pure days, or a pure seconds interval to a timestamptz
value.
We are trying to get the most reasonable output for fractional values
--- I stand by my statements.
Unless you avoid mixed interval values, it’s very hard (even though it is
possible) to predict the outcomes of interval arithmetic. Rather, all you get
is emergent behavior that, as far as I can see, cannot be relied upon in
deliberately designed application code. Here’s a telling example:
The point is that we will get unusual values, so we should do the best
we can.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, Jul 21, 2021 at 6:43 PM Bruce Momjian <bruce@momjian.us> wrote:
We are trying to get the most reasonable output for fractional values
--- I stand by my statements.
The point is that we will get unusual values, so we should do the best
we can.
Hi,
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
Should the handling for year use the same check as that for month ?
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
Cheers
On Thu, Jul 22, 2021 at 2:59 PM Zhihong Yu <zyu@yugabyte.com> wrote:
Should the handling for year use the same check as that for month ?
Hi,
I guess the reason for the current patch was that the year-to-months
conversion is accurate.
On the new test:
+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";
0.16 * 31 = 4.96 < 5
I wonder why 5 days were chosen in the test output.
On Thu, Jul 22, 2021 at 03:17:52PM -0700, Zhihong Yu wrote:
On Thu, Jul 22, 2021 at 2:59 PM Zhihong Yu <zyu@yugabyte.com> wrote:
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
Should the handling for year use the same check as that for month ?
- AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+ /* round to a full month? */
+ if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
I guess the reason for the current patch was that the year-to-months
conversion is accurate.
Our internal units are hours/days/seconds, so the spill _up_ from months
to years happens automatically:
SELECT INTERVAL '23.99 months';
interval
----------
2 years
On the new test:
+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";
0.16 * 31 = 4.96 < 5
I wonder why 5 days were chosen in the test output.
We use 30 days/month, not 31. However, I think you are missing the
changes in the patch and I am just understanding them fully now. There
are two big changes:
1. The amount of spill from months only to days
2. The _rounding_ of the result once we stop spilling at months or days
#2 is the part I think you missed.
One thing missing from my previous patch was the handling of negative
units, which is now handled properly in the attached patch:
SELECT INTERVAL '-1.99 years';
interval
----------
-2 years
SELECT INTERVAL '-1.99 months';
interval
----------
-2 mons
I ended up creating a function to handle this, which allowed me to
simplify some of the surrounding code.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
On 23-Jul-2021, at 08:05, Bruce Momjian <bruce@momjian.us> wrote:
Our internal units are hours/days/seconds, so the spill _up_ from months
to years happens automatically:
SELECT INTERVAL '23.99 months';
 interval
----------
 2 years
Will the same new spilldown rules hold in the same way for interval multiplication and division as they will for the interpretation of an interval literal?
The semantics here are (at least as far as my limited search skills have shown me) simply undocumented. But my tests in 13.3 have to date not disproved this hypothesis:
* considering "new_i ◄— i * f"
* (notice that the internal representation is _months_, days, and seconds, at odds with “Our internal units are hours/days/seconds” above)
* let i’s internal representation be [mm, dd, ss]
* new_i’s “intermediate” internal representation is [mm*f, dd*f, ss*f]
* input these values to the same spilldown algorithm that is applied when these same intermediate values are used in an interval literal
* so the result is [new_mm, new_dd, new_ss]
Here’s an example:
select
'1.2345 months 1.2345 days 1.2345 seconds'::interval =
'1 month 1 day 1 second'::interval*1.2345;
In 13.3, the result is TRUE. (I know that this doesn’t guarantee that the internal representations of the two compared interval values are the same. But it’s a necessary condition for the outcome that I’m referring to, and it serves to indicate the point I’m making. A more careful test can be made.)
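Here’s a sketch of such a finer-grained test, assuming that extract() faithfully reports each stored field:
with v as (
  select '1.2345 months 1.2345 days 1.2345 seconds'::interval as a,
         '1 month 1 day 1 second'::interval * 1.2345          as b)
select extract(month  from a) = extract(month  from b) as mm_eq,
       extract(day    from a) = extract(day    from b) as dd_eq,
       extract(hour   from a) = extract(hour   from b) as hh_eq,
       extract(minute from a) = extract(minute from b) as mi_eq,
       extract(second from a) = extract(second from b) as ss_eq
from v;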
On Fri, Jul 23, 2021 at 10:55:11AM -0700, Bryn Llewellyn wrote:
SELECT
'1.2345 months 1.2345 days 1.2345 seconds'::interval =
'1 month 1 day 1 second'::interval*1.2345;
In 13.3, the result is TRUE. (I know that this doesn’t guarantee that the
internal representations of the two compared interval values are the same. But
it’s a necessary condition for the outcome that I’m referring to, and it serves
to indicate the point I’m making. A more careful test can be made.)
So you are saying fractional unit output should match multiplication
output? It doesn't now for all units:
SELECT interval '1.3443 years';
interval
---------------
1 year 4 mons
SELECT interval '1 years' * 1.3443;
?column?
---------------------------------
1 year 4 mons 3 days 22:45:07.2
It is true this patch is further reducing that matching. Do people
think I should make them match as part of this patch?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On 23-Jul-2021, bruce@momjian.us wrote:
[...]
Summary:
--------
It seems to me that the best thing to do is fix the coarse bug about which there is unanimous agreement and to leave everything else (quirks and all) untouched.
Detail:
-------
My previous email (to which Bruce replies here) was muddled. Sorry. The challenge is that there are a number of mechanisms at work. Their effects are conflated. And it’s very hard to unscramble them.
The underlying problem is that the internal representation of an interval value is a [mm, dd, ss] tuple. This fact is documented. The representation is key to understanding the definitions of these operations (a small illustration follows the list):
— defining an interval value from a text literal that uses real number values for its various fields.
— defining an interval value from make_interval(). (This one is easy because the API requires integral values for all but the seconds argument. It would be interesting to know why this asymmetrical definition was implemented. It seems to imply that somebody thought that spilldown was a bad idea and should be prevented before it might happen.)
— creating the text typecast of an extant interval value.
— creating an interval value by adding/subtracting an extant interval value to/from another
— creating an interval value by multiplying or dividing an extant interval value by a (real) number
— creating an interval value by subtracting a pair of moments of the same data type (timestamptz, plain timestamp, or time)
— creating a new moment value by adding or subtracting an extant interval value to an extant moment value of the same data type.
— creating an interval value by applying justify_hours(i), justify_days(i), and justify_interval(i) to an extant interval value, i.
— creating a double precision value by applying extract(epoch from i)
to an extant interval value, i.
— evaluating inequality and equality tests to compare two extant interval values.
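To see the three stored fields for yourself, here’s a quick sketch (assuming that extract() reports each field of the tuple faithfully):
select i,
       extract(year from i)*12 + extract(month from i) as mm,
       extract(day from i)                             as dd,
       extract(hour from i)*3600
         + extract(minute from i)*60
         + extract(second from i)                      as ss
from (select '1.5 months 1.5 days'::interval as i) as s;
-- 1 mon 16 days 12:00:00 | 1 | 16 | 43200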
Notice that, for example, this test:
select
interval '1.3443 years' as i1,
interval '1 years' * 1.3443 as i2;
conflates three things: converting an interval literal to a [mm, dd, ss] tuple; multiplying a [mm, dd, ss] tuple by a real number; and converting a [mm, dd, ss] tuple to a text representation. Similarly, this test:
select
interval '1.3443 years' <
interval '1 years' * 1.3443;
conflates three things: converting an interval literal to a [mm, dd, ss] tuple; multiplying a [mm, dd, ss] tuple by a real number; and inequality comparison of two [mm, dd, ss] tuples.
As far as I’ve been able to tell, the PG documentation doesn’t do a good job of defining the semantics of any of these operations. Some (like the “justify” functions) are sketched reasonably well. Others, like interval multiplication, are entirely undefined.
This makes discussion of simple tests like the two I showed immediately above hard. It also makes any discussion of correctness, possible bugs, and proposed implementation changes very difficult.
Further, it also makes it hard to see how tests for application code that uses any of these operations can be designed. The normal approach relies on testing that you get what you expect. But here, you don't know what to expect—unless (as I’ve done) you first assert hypotheses for the undefined operations and test them with programmed simulations. Of course, this is, in general, an unreliable approach. The only way to know what code is intended to do is to read the prose specification that informs the implementation.
I had forgotten one piece of the long history of this thread. Soon after I presented the testcase that folks agree shows a clear bug, I asked about the rules for creating the internal [mm, dd, ss] tuple from a text literal that uses real numbers for the fields. My tests showed two things: (1) an intuitively clear model for the spilldown of nonintegral months to days and then, in turn, of nonintegral days to seconds; and (2) a quirky rule for deriving intermediate months from fractional years and fractional months before then using the more obvious rules to spill to days. (This defies terse expression in prose. I copied my PL/pgSQL implementation below.)
There was initially some discussion about changing the implementation of the spill-down from [years, months] in the interval literal to the ultimate [mm, dd, ss] representation. This is what Bruce is asking about. And it’s what I was muddled about.
As I’ve said, my conclusion is that the only safe approach is to create and use only “pure” interval values (where just one of the internal fields is non-zero). For this reason (and having seen what I decided were the impossibly unmemorable rules that my modeled implementation uses) I didn’t look at the rules for the other fields that the interval literal allows (weeks, centuries, millennia, and so on).
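For reference, here’s what I mean by “pure” values, built with make_interval() so that no spilldown can occur (its arguments, apart from secs, are integers):
select
  make_interval(months => 17)   as pure_months,  -- displays as 1 year 5 mons; only the mm field is set
  make_interval(days   => 500)  as pure_days,
  make_interval(secs   => 12.5) as pure_seconds;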
--------------------------------------------------------------------------------
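-- Fragment from the declaration section of my modeled PL/pgSQL implementation.
-- Assumed context: p is the input record (p.yy, p.mm, p.dd, p.hh, p.mi, p.ss),
-- all double precision; mm_per_yy = 12, dd_per_mm = 30, ss_per_dd = 24*60*60,
-- ss_per_hh = 60*60, and ss_per_mi = 60 are constants declared elsewhere.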
mm_trunc constant int not null := trunc(p.mm);
mm_remainder constant double precision not null := p.mm - mm_trunc::double precision;
-- This is a quirk.
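-- The fractional months that p.yy contributes are truncated away here, so a
-- fraction of a year never spills below whole months.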
mm_out constant int not null := trunc(p.yy*mm_per_yy) + mm_trunc;
dd_real_from_mm constant double precision not null := mm_remainder*dd_per_mm;
dd_int_from_mm constant int not null := trunc(dd_real_from_mm);
dd_remainder_from_mm constant double precision not null := dd_real_from_mm - dd_int_from_mm::double precision;
dd_int_from_user constant int not null := trunc(p.dd);
dd_remainder_from_user constant double precision not null := p.dd - dd_int_from_user::double precision;
dd_out constant int not null := dd_int_from_mm + dd_int_from_user;
d_remainder constant double precision not null := dd_remainder_from_mm + dd_remainder_from_user;
ss_out constant double precision not null := d_remainder*ss_per_dd +
p.hh*ss_per_hh +
p.mi*ss_per_mi +
p.ss;
--------------------------------------------------------------------------------
On Sun, Jul 25, 2021 at 11:56:54AM -0700, Bryn Llewellyn wrote:
As far as I’ve been able to tell, the PG documentation doesn’t do a good job of
defining the semantics of any of these operations. Some (like the “justify”
This is because fractional interval values are not used or asked about
often.
functions) are sketched reasonably well. Others, like interval multiplication,
are entirely undefined.
Yes, the “justify” functions were requested and implemented because they
met a frequently-requested need unrelated to fractional values, though
they do have spill-up uses.
This makes discussion of simple tests like the two I showed immediately above
hard. It also makes any discussion of correctness, possible bugs, and proposed
implementation changes very difficult.
Agreed. With fractional values an edge use-case, we are trying to find
the most useful implementation.
As I’ve said, my conclusion is that the only safe approach is to create and use
only “pure” interval values (where just one of the internal fields is
non-zero). For this reason (and having seen what I decided were the impossibly
unmemorable rules that my modeled implementation uses) I didn’t look at the
rules for the other fields that the interval literal allows (weeks, centuries,
millennia, and so on).
I think the current page is clear about _specifying_ fractional units,
but you are right that multiplication/division of fractional values is
not covered.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Apr 2, 2021 at 09:02:29PM -0400, Bruce Momjian wrote:
On Fri, Apr 2, 2021 at 05:50:59PM -0700, Bryn Llewellyn wrote:
are the user’s parameterization. All are real numbers. Because non-integral
values for years, months, days, hours, and minutes are allowed when you specify
a value using the ::interval typecast, my reference doc must state the rules. I
would have struggled to express these rules in prose—especially given the use
both of trunc() and floor(). I would have struggled more to explain what
requirements these rules meet.
The fundamental issue is that while months, days, and seconds are
consistent in their own units, when you have to cross from one unit to
another, it is by definition imprecise, since the interval is not tied
to a specific date, with its own days-of-the-month and leap days and
daylight savings time changes. It feels like it is going to be
imprecise no matter what we do.
Adding to this is the fact that interval values are stored in C 'struct
tm' defined in libc's ctime(), where months are integers, so carrying
around non-integer month values until we get a final result would add a
lot of complexity, and complexity to a system that is by definition
imprecise, which doesn't seem worth it.
I went ahead and modified the interval multiplication/division functions
to use the same logic as fractional interval units:
SELECT interval '23 mons';
interval
----------------
1 year 11 mons
SELECT interval '23 mons' / 2;
?column?
-----------------
11 mons 15 days
SELECT interval '23.5 mons';
interval
------------------------
1 year 11 mons 15 days
SELECT interval '23.5 mons' / 2;
?column?
--------------------------
11 mons 22 days 12:00:00
I think the big issue is that the cast to interval converts values into
integer mons/days/secs, so we can no longer distinguish units greater
than months from months themselves.
Using Bryn's example, the master branch output is:
SELECT
interval '1.3443 years' as i1,
interval '1 years' * 1.3443 as i2;
i1 | i2
---------------+---------------------------------
1 year 4 mons | 1 year 4 mons 3 days 22:45:07.2
and the attached patch output is:
SELECT
interval '1.3443 years' as i1,
interval '1 years' * 1.3443 as i2;
i1 | i2
---------------+----------------------
1 year 4 mons | 1 year 4 mons 4 days
which looks like an improvement.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Bruce Momjian <bruce@momjian.us> writes:
I went ahead and modified the interval multiplication/division functions
to use the same logic as fractional interval units:
Wait. A. Minute.
What I think we have consensus on is that interval_in is doing the
wrong thing in a particular corner case. I have heard nobody but
you suggesting that we should start undertaking behavioral changes
in other interval functions, and I don't believe that that's a good
road to start going down. These behaviors have stood for many years.
Moreover, since the whole thing is by definition operating with
inadequate information, it is inevitable that for every case you
make better there will be another one you make worse.
I'm really not on board with changing anything except interval_in,
and even there, we had better be certain that everything we change
is a case that is certainly being made better.
BTW, please do not post patches as gzipped attachments, unless
they're enormous. You're just adding another step making it
harder for people to look at them.
regards, tom lane
On Tue, Jul 27, 2021 at 04:01:54PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
I went ahead and modified the interval multiplication/division functions
to use the same logic as fractional interval units:
Wait. A. Minute.
What I think we have consensus on is that interval_in is doing the
wrong thing in a particular corner case. I have heard nobody but
you suggesting that we should start undertaking behavioral changes
in other interval functions, and I don't believe that that's a good
road to start going down. These behaviors have stood for many years.
Moreover, since the whole thing is by definition operating with
inadequate information, it is inevitable that for every case you
make better there will be another one you make worse.
Bryn mentioned this so I thought I would see what the result looks like.
I am fine to skip them.
I'm really not on board with changing anything except interval_in,
and even there, we had better be certain that everything we change
is a case that is certainly being made better.
Well, I think what I had before the multiply/divide changes was
acceptable to everyone except Bryn, who was looking for more
consistency.
BTW, please do not post patches as gzipped attachments, unless
they're enormous. You're just adding another step making it
harder for people to look at them.
OK, what is large for you? 100k bytes? I was using 10k bytes.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On 27-Jul-2021, at 14:13, Bruce Momjian <bruce@momjian.us> wrote:
[...]
Before I say anything else, I’ll stress what I wrote recently (under the heading “summary”). I support Tom’s idea that the only appropriate change to make is to fix only the exactly self-evident bug that I reported at the start of this thread.
I fear that Bruce doesn’t understand my point about interval multiplication (which includes multiplying by a number whose absolute value lies between 0 and 1). Here it is. I believe that the semantics are (and should be) defined like this:
[mm, dd, ss]*n == post_spilldown([mm*n, dd*n, ss*n])
where the function post_spilldown() applies the rules that are used when an interval literal that specifies only values for months, days, and seconds is converted to the internal [mm, dd, ss] representation—where mm and dd are 4-byte integers and ss is an 8-byte integer that represents microseconds.
Here’s a simple test that’s consistent with that hypothesis:
with
c1 as (
select
'1 month 1 day 1 second'::interval as i1,
'1.234 month 1.234 day 1.234 second'::interval as i3),
c2 as (
select i1*1.234 as i2, i3 from c1)
select i2::text as i2_txt, i3::text as i3_txt from c2;
Here’s the result:
i2_txt | i3_txt
---------------------------+---------------------------
1 mon 8 days 06:05:46.834 | 1 mon 8 days 06:05:46.834
So far, I’m happy.
But, like I said, I’d forgotten an orthogonal quirk. This test shows it. It’s informed by the fact that 1.2345*12.0 is 14.8140.
select
('1.2345 years' ::interval)::text as i1_txt,
('14.8140 months'::interval)::text as i2_txt;
Here’s the result:
i1_txt | i2_txt
---------------+--------------------------------
1 year 2 mons | 1 year 2 mons 24 days 10:04:48
It seems to me to be crazy behavior. I haven’t found any account of it in the PG docs. Others have argued that it’s a sensible result. Anyway, I don’t believe that I’ve ever argued that it’s a bug. I wanted only to know what rationale informed the design. I agree that changing the behavior here would be problematic for extant code.
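For the record, here’s the arithmetic behind the spilled part of i2, sketched with the documented 30-days-per-month and 24-hours-per-day factors:
select
  trunc(0.8140*30)                   as spilled_days,   -- 24
  (0.8140*30 - trunc(0.8140*30))*24  as spilled_hours;  -- 10.08, i.e. 10:04:48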
This quirk explains the outcome of this test:
select
('1.2345 years'::interval)::text as i1_txt,
('14.8140 months'::interval)::text as i2_txt,
(1.2345*('1 years'::interval))::text as i3_txt;
This is the result:
i1_txt | i2_txt | i3_txt
---------------+--------------------------------+--------------------------------
1 year 2 mons | 1 year 2 mons 24 days 10:04:48 | 1 year 2 mons 24 days 10:04:48
Notice that the same text is reported for i2_txt as for i3_txt.
On Tue, Jul 27, 2021 at 3:36 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:
with
c1 as (
select
'1 month 1 day 1 second'::interval as i1,
'1.234 month 1.234 day 1.234 second'::interval as i3),
c2 as (
select i1*1.234 as i2, i3 from c1)
select i2::text as i2_txt, i3::text as i3_txt from c2;
It's nice to envision all forms of fancy calculations. But the fact is that
'1.5 month'::interval * 2 != '3 month'::interval
with any of these patches - and if that doesn't work - the rest of the
strange numbers really seem to be irrelevant.
If there is a desire to handle fractional cases - then all pieces need to
be held as provided until they are transformed into something. In other
words - 1.5 month needs to be held as 1.5 month until we ask for it to be
reduced to 1 month and 15 days at some point. If the interval data type
immediately casts 1.5 months to 1 month 15 days then all subsequent
calculations are going to be wrong.
I appreciate there is generally no way to accomplish this right now - but
that means walking away from things like 1 month * 1.234 as being not
calculable as opposed to trying to piece something together that fails
pretty quickly.
John
On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
It's nice to envision all forms of fancy calculations. But the fact is that
'1.5 month'::interval * 2 != '3 month'::interval
That's not exactly true. Even without the patch:
SELECT '1.5 month'::interval * 2 AS product,
'3 month'::interval AS expected,
justify_interval('1.5 month'::interval * 2) AS justified_product,
'1.5 month'::interval * 2 = '3 month'::interval AS equal;
product | expected | justified_product | equal
----------------+----------+-------------------+-------
2 mons 30 days | 3 mons | 3 mons | t
(1 row)
So it's equal even without calling justify_interval() on the result.
FWIW, I remain of the opinion that the interval literal code should
just spill down to lower units in all cases, just like the
multiplication and division code, so that the results are consistent
(barring floating point rounding errors) and explainable.
Regards,
Dean
On Wed, Jul 28, 2021 at 12:42 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:
On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
It's nice to envision all forms of fancy calculations. But the fact is
that
'1.5 month'::interval * 2 != '3 month'::interval
That's not exactly true. Even without the patch:
SELECT '1.5 month'::interval * 2 AS product,
'3 month'::interval AS expected,
justify_interval('1.5 month'::interval * 2) AS justified_product,
'1.5 month'::interval * 2 = '3 month'::interval AS equal;
product | expected | justified_product | equal
----------------+----------+-------------------+-------
2 mons 30 days | 3 mons | 3 mons | t
(1 row)
That's viewing something via the mechanism that is incorrectly (technically
speaking) doing the work in the first place. It believes they are the same
- but they are clearly not when actually used.
select '1/1/2001'::date + (interval '3 month');
?column?
---------------------
2001-04-01 00:00:00
(1 row)
vs
select '1/1/2001'::date + (interval '1.5 month' * 2);
?column?
---------------------
2001-03-31 00:00:00
(1 row)
That's the flaw in this entire body of work - we keep taking fractional
amounts - doing round offs and then trying to add or multiply the pieces
back and ending up with weird floating point math style errors. That's
not to complain about it - but we shouldn't be looking at edge cases with
things like 1 month * 1.234 when 1.5 months * 2 doesn't work properly.
John
P.S. Finally we have items like this
select '12/1/2001'::date + (interval '1.5 months' * 2);
?column?
---------------------
2002-03-03 00:00:00
(1 row)
postgres=# select '1/1/2001'::date + (interval '1.5 months' * 2);
?column?
---------------------
2001-03-31 00:00:00
(1 row)
Which only has a 28 day gap because of the length of February - clearly
this is not working quite right.
On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
It's nice to envision all forms of fancy calculations. But the fact is that
'1.5 month'::interval * 2 != '3 month'::interval
That's not exactly true. Even without the patch:
SELECT '1.5 month'::interval * 2 AS product,
'3 month'::interval AS expected,
justify_interval('1.5 month'::interval * 2) AS justified_product,
'1.5 month'::interval * 2 = '3 month'::interval AS equal;
product | expected | justified_product | equal
----------------+----------+-------------------+-------
2 mons 30 days | 3 mons | 3 mons | t
(1 row)
So it's equal even without calling justify_interval() on the result.
FWIW, I remain of the opinion that the interval literal code should
just spill down to lower units in all cases, just like the
multiplication and division code, so that the results are consistent
(barring floating point rounding errors) and explainable.
Here is a more minimal patch that doesn't change the spill-down units at
all, but merely documents it, and changes the spilldown to months to
round instead of truncate.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachment: interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 453115f942..bd938a675a 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2840,15 +2840,17 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
+ because interval internally stores only three integer units (months,
+ days, seconds), fractional units must be spilled to smaller units.
+ For example, because months are approximated to equal 30 days,
+ fractional values of units greater than months is rounded to be the
+ nearest integer number of months. Fractional units of months or less
+ are computed to be an integer number of days and seconds, assuming
+ 24 hours per day. For example, <literal>'1.5 months'</literal>
+ becomes <literal>1 month 15 days</literal>. Only seconds will ever
+ be shown as fractional on output.
</para>
<para>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..cb3fa85892 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3306,29 +3306,25 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3561,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3597,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 02b3c47223..a7e530cb5d 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -528,29 +528,25 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
On Tue, Jul 27, 2021 at 4:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
What I think we have consensus on is that interval_in is doing the
wrong thing in a particular corner case. I have heard nobody but
you suggesting that we should start undertaking behavioral changes
in other interval functions, and I don't believe that that's a good
road to start going down. These behaviors have stood for many years.
Moreover, since the whole thing is by definition operating with
inadequate information, it is inevitable that for every case you
make better there will be another one you make worse.
I agree that we need to be really conservative here. I think Tom is
right that if we start changing behaviors that "seem wrong," we will
probably make some things better and other things worse. The overall
amount of stuff that "seems wrong" will probably not go down, but a
lot of people's applications will break when they try to upgrade to
v15. That's not going to be a win overall.
I think a lot of the discussion on this thread consists of people
hoping for things that are not very realistic. The interval type
represents the number of months as an integer, and the number of days
as an integer. That means that an interval like '0.7 months' does not
really exist. If you ask for that interval what you get is actually 21
days, which is a reasonable approximation of 0.7 months but not the
same thing, except in April, June, September, and November. So when
you then say that you want 0.7 months + 0.3 months to equal 1.0
months, what you're really requesting is that 21 days + 9 days = 1
month. That system has been tried in the past, but it was abandoned
roughly around the time of Julius Caesar for the very good reason that
the orbital period of the earth about the sun is noticeably greater
than 360 days.
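To make that concrete, a quick check of what the input function actually
stores (as of 13.x):
SELECT '0.7 months'::interval AS stored,
       '0.7 months'::interval + '0.3 months'::interval AS sum;
-- 21 days | 30 days (not "1 mon")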
It would be entirely possible to design a data type that could
represent such values more exactly. A data type that had a
representation similar to interval but with double values for the
numbers of years and months would be able to compute 0.7 months + 0.3
months and get 1.0 months with no problem.
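A sketch of the shape such a type might take (purely illustrative, with a
made-up name; not a proposal):
-- Hypothetical: keep the fractions exact until the value is finally applied.
CREATE TYPE exact_interval AS (
    years  double precision,
    months double precision,
    days   double precision,
    secs   double precision
);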
If we were doing this over again, I would argue that, with this
on-disk representation, 0.7 months ought to be rejected as invalid
input, because it's generally not a good idea to have a data type that
silently converts a value into a different value that is not
equivalent for all purposes. It is confusing and causes people to
expect behavior different from what they will actually get. Now, it
seems far too late to consider such a change at this point ... and it
is also no good considering a change to the on-disk representation of
the existing data type at this point ... but it is also no good
pretending like we have a floating-point representation of months and
days when we actually do not.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
If we were doing this over again, I would argue that, with this
on-disk representation, 0.7 months ought to be rejected as invalid
input, because it's generally not a good idea to have a data type that
silently converts a value into a different value that is not
equivalent for all purposes. It is confusing and causes people to
expect behavior different from what they will actually get. Now, it
seems far too late to consider such a change at this point ... and it
is also no good considering a change to the on-disk representation of
the existing data type at this point ... but it is also no good
pretending like we have a floating-point representation of months and
days when we actually do not.
You know, I was thinking exactly that thing earlier. Changing the
on-disk representation is certainly a nonstarter, but the problem
here stems from expecting interval_in to do something sane with inputs
that do not correspond to any representable value. I do not think we
have any other datatypes where we expect the input function to make
choices like that.
Is it really too late to say "that was a damfool idea" and drop fractional
years/months/etc from interval_in's lexicon? By definition, this will not
create any dump/reload problems, because interval_out will never emit any
such thing. It will break applications that are expecting such syntax to
do something sane. But that expectation is fundamentally not meetable,
so maybe we should just make a clean break. (Without back-patching it,
of course.)
I'm not entirely sure about whether to reject fractional days, though.
Converting those on the assumption of 1 day == 24 hours is not quite
theoretically right. But it's unsurprising, which is not something
we can say about fractions of the larger units. So maybe it's OK to
continue accepting that.
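For reference, the current conversion:
regression=# select '1.5 days'::interval;
    interval
----------------
 1 day 12:00:00
(1 row)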
regards, tom lane
On Wed, Jul 28, 2021 at 11:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You know, I was thinking exactly that thing earlier. Changing the
on-disk representation is certainly a nonstarter, but the problem
here stems from expecting interval_in to do something sane with inputs
that do not correspond to any representable value. I do not think we
have any other datatypes where we expect the input function to make
choices like that.
It's not exactly the same issue, but the input function whose behavior
most regularly trips people up is bytea, because they try something
like 'x'::bytea and it seems to DWTW so then they try it on all their
data and discover that, for example, '\'::bytea fails outright, or
that ''::bytea = '\x'::bytea, contrary to expectations. People often
seem to think that casting to bytea should work like convert_to(), but
it doesn't. As in the case at hand, byteain() has to guess whether the
input is intended to be the 'hex' or 'escape' format, and because the
'escape' format looks a lot like plain old text, confusion ensues.
Now, guessing between two input formats that are both legal for the
data type is not exactly the same as guessing what to do with a value
that's not directly representable, but it has the same ultimate effect
i.e. the human beings perceive the system as buggy.
A case that is perhaps more theoretically similar to the instance at
hand is rounding during the construction of floating point values. My
system thinks '1.00000000000000000000000001'::float = '1'::float, so
in that case, as in this one, we've decided that it's OK to build an
inexact representation of the input value. I don't really see what can
be done about this considering that the textual representation uses
base 10 and the internal representation uses base 2, but I think this
doesn't cause us as many problems in practice because people
understand how it works, which doesn't seem to be the case with the
interval data type, at least if this thread is any indication.
I am dubious that it's worth the pain of making the input function
reject cases involving fractional units. It's true that some people
here aren't happy with the current behavior, but they may be no happier
if we reject those cases with an error, and other people may then be
unhappy too. I think your previous idea was the best one so far: fix
the input function so that 'X years Y months' and 'Y months X years'
always produce the same answer, and call it good.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jul 28, 2021 at 11:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You know, I was thinking exactly that thing earlier. Changing the
on-disk representation is certainly a nonstarter, but the problem
here stems from expecting interval_in to do something sane with inputs
that do not correspond to any representable value. I do not think we
have any other datatypes where we expect the input function to make
choices like that.
A case that is perhaps more theoretically similar to the instance at
hand is rounding during the construction of floating point values. My
system thinks '1.00000000000000000000000001'::float = '1'::float, so
in that case, as in this one, we've decided that it's OK to build an
inexact representation of the input value.
Fair point, but you decided when you chose to use float that you don't
care about the differences between numbers that only differ at the
seventeenth or so decimal place. (Maybe, if you don't understand what
float is, you didn't make that choice intentionally ... but that's
a documentation issue not a code shortcoming.) However, it's fairly
hard to believe that somebody who writes '1.4 years'::interval doesn't
care about the 0.4 year. The fact that we silently convert that to,
effectively, 1.33333333... years seems like a bigger roundoff error
than one would expect.
I am dubious that it's worth the pain of making the input function
reject cases involving fractional units. It's true that some people
here aren't happy with the current behavior, but they may be no happier
if we reject those cases with an error, and other people may then be
unhappy too.
Maybe. A possible compromise is to accept only exactly-representable
fractions. Then, for instance, we'd take 1.5 years (resulting in 18
months) but not 1.4 years. Now, this might fall foul of your point about
not wanting to mislead people into expecting the system to do things it
can't; but I'd argue that the existing behavior misleads them much more.
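For what it's worth, the exactly-representable case already comes out clean
today:
regression=# select '1.5 years'::interval;
    interval
---------------
 1 year 6 mons
(1 row)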
I think your previous idea was the best one so far: fix
the input function so that 'X years Y months' and 'Y months X years'
always produce the same answer, and call it good.
That would clearly be a bug fix. I'm just troubled that there are
still behaviors that people will see as bugs.
regards, tom lane
On Wed, Jul 28, 2021 at 1:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Fair point, but you decided when you chose to use float that you don't
care about the differences between numbers that only differ at the
seventeenth or so decimal place. (Maybe, if you don't understand what
float is, you didn't make that choice intentionally ... but that's
a documentation issue not a code shortcoming.) However, it's fairly
hard to believe that somebody who writes '1.4 years'::interval doesn't
care about the 0.4 year. The fact that we silently convert that to,
effectively, 1.33333333... years seems like a bigger roundoff error
than one would expect.
Yeah, that's definitely a fair point!
I am dubious that it's worth the pain of making the input function
reject cases involving fractional units. It's true that some people
here aren't happy with the current behavior, but they may be no happier
if we reject those cases with an error, and other people may then be
unhappy too.
Maybe. A possible compromise is to accept only exactly-representable
fractions. Then, for instance, we'd take 1.5 years (resulting in 18
months) but not 1.4 years. Now, this might fall foul of your point about
not wanting to mislead people into expecting the system to do things it
can't; but I'd argue that the existing behavior misleads them much more.
Well, let's see what other people think.
I think your previous idea was the best one so far: fix
the input function so that 'X years Y months' and 'Y months X years'
always produce the same answer, and call it good.
That would clearly be a bug fix. I'm just troubled that there are
still behaviors that people will see as bugs.
That's a reasonable thing to be troubled about, but the date and time
related datatypes have so many odd and crufty behaviors that I have a
hard time believing that there's another possible outcome. If somebody
showed up today and proposed a new data type and told us that the way
to format values of that data type was to say to_char(my_value,
alphabet_soup) I think they would not be taken very seriously. A lot
of this code, and the associated interfaces, date back to a time when
PostgreSQL was far more primitive than today, and when databases in
general were as well. At least we didn't end up with a datatype called
varchar2 ... or not yet, anyway.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jul 28, 2021 at 1:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
That would clearly be a bug fix. I'm just troubled that there are
still behaviors that people will see as bugs.
That's a reasonable thing to be troubled about, but the date and time
related datatypes have so many odd and crufty behaviors that I have a
hard time believing that there's another possible outcome.
There's surely a ton of cruft there, but I think most of it stems from
western civilization's received rules for timekeeping, which we can do
little about. But the fact that interval_in accepts '1.4 years' when
it cannot do anything very reasonable with that input is entirely
self-inflicted damage.
BTW, I don't have a problem with the "interval * float8" operator
doing equally strange things, because if you don't like what it
does you can always write your own multiplication function that
you like better. There can be only one interval_in, though,
so I don't think it should be misrepresenting the fundamental
capabilities of the datatype.
regards, tom lane
On Wed, Jul 28, 2021 at 11:19:16AM -0400, Bruce Momjian wrote:
On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
So it's equal even without calling justify_interval() on the result.
FWIW, I remain of the opinion that the interval literal code should
just spill down to lower units in all cases, just like the
multiplication and division code, so that the results are consistent
(barring floating point rounding errors) and explainable.
Here is a more minimal patch that doesn't change the spill-down units at
all, but merely documents it, and changes the spilldown to months to
round instead of truncate.
Unless I hear more feedback, I plan to apply this doc patch to all
branches with the word "rounded" changed to "truncated" in the back
branches, and apply the rounded code changes to master.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Jul 30, 2021 at 12:04:39PM -0400, Bruce Momjian wrote:
On Wed, Jul 28, 2021 at 11:19:16AM -0400, Bruce Momjian wrote:
On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
So it's equal even without calling justify_interval() on the result.
FWIW, I remain of the opinion that the interval literal code should
just spill down to lower units in all cases, just like the
multiplication and division code, so that the results are consistent
(barring floating point rounding errors) and explainable.
Here is a more minimal patch that doesn't change the spill-down units at
all, but merely documents it, and changes the spilldown to months to
round instead of truncate.
Unless I hear more feedback, I plan to apply this doc patch to all
branches with the word "rounded" changed to "truncated" in the back
branches, and apply the rounded code changes to master.
Now that I think of it, I will just remove the word "rounded" from the
back branch docs so we are technically breaking the documented API less
in PG 15.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Bruce Momjian <bruce@momjian.us> writes:
On Fri, Jul 30, 2021 at 12:04:39PM -0400, Bruce Momjian wrote:
Unless I hear more feedback, I plan to apply this doc patch to all
branches with the word "rounded" changed to "truncated" in the back
branches, and apply the rounded code changes to master.
Now that I think of it, I will just remove the word "rounded" from the
back branch docs so we are technically breaking the documented API less
in PG 15.
I think your first idea was better. Not documenting the behavior
doesn't make this not an API change; it just makes it harder for
people to understand what changed.
The doc patch itself is not exactly fine:
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
I think "some field values", as it was worded previously, was better.
If you try to write 01.5:02:03, that is not going to be interpreted
as 1.5 hours. (Hmm, I get something that seems quite insane:
regression=# select '01.5:02:03'::interval;
interval
----------------
1 day 14:03:00
(1 row)
I wonder what it thinks it's doing there.)
This is wrong:
+ because interval internally stores only three integer units (months,
+ days, seconds), fractional units must be spilled to smaller units.
s/seconds/microseconds/ is probably enough to fix that.
+ For example, because months are approximated to equal 30 days,
+ fractional values of units greater than months is rounded to be the
+ nearest integer number of months. Fractional units of months or less
+ are computed to be an integer number of days and seconds, assuming
+ 24 hours per day. For example, <literal>'1.5 months'</literal>
+ becomes <literal>1 month 15 days</literal>.
This entire passage is vague, and grammatically shaky too. Perhaps
more like
Fractional parts of units larger than months are rounded to the
nearest integer number of months; for example '1.5 years'
becomes '1 year 6 mons'. Fractional parts of months are rounded
to the nearest integer number of days, using the assumption that
one month equals 30 days; for example '1.5 months'
becomes '1 mon 15 days'. Fractional parts of days and weeks
are converted to microseconds, using the assumption that one day
equals 24 hours.
On output, the months field is shown as an appropriate number of
years and months; the days field is shown as-is; the microseconds
field is converted to hours, minutes, and possibly-fractional
seconds.
regards, tom lane
On Fri, Jul 30, 2021 at 12:49:34PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Now that I think of it, I will just remove the word "rounded" from the
back branch docs so we are technically breaking the documented API less
in PG 15.
I think your first idea was better. Not documenting the behavior
doesn't make this not an API change; it just makes it harder for
people to understand what changed.
OK. However, I thought we were more worried about changing documented
APIs than undocumented ones. Anyway, I will do as you suggested.
The doc patch itself is not exactly fine:
+ Field values can have fractional parts; for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
I think "some field values", as it was worded previously, was better.
If you try to write 01.5:02:03, that is not going to be interpreted
as 1.5 hours. (Hmm, I get something that seems quite insane:
regression=# select '01.5:02:03'::interval;
interval
----------------
1 day 14:03:00
(1 row)
I wonder what it thinks it's doing there.)
It thinks 01.5:02:03 is Days:Hours:Minutes, so I think all fields can use
fractions:
SELECT interval '1.5 minutes';
interval
----------
00:01:30
This is wrong:
+ because interval internally stores only three integer units (months,
+ days, seconds), fractional units must be spilled to smaller units.
s/seconds/microseconds/ is probably enough to fix that.
OK, there were a few places that said "seconds" so I fixed those too.
+ For example, because months are approximated to equal 30 days,
+ fractional values of units greater than months is rounded to be the
+ nearest integer number of months. Fractional units of months or less
+ are computed to be an integer number of days and seconds, assuming
+ 24 hours per day. For example, <literal>'1.5 months'</literal>
+ becomes <literal>1 month 15 days</literal>.
This entire passage is vague, and grammatically shaky too. Perhaps
more likeFractional parts of units larger than months are rounded to the
nearest integer number of months; for example '1.5 years'
becomes '1 year 6 mons'. Fractional parts of months are rounded
to the nearest integer number of days, using the assumption that
one month equals 30 days; for example '1.5 months'
The newest patch actually doesn't work as explained above --- fractional
months now continue to spill to microseconds. I think you are looking
at a previous version.
becomes '1 mon 15 days'. Fractional parts of days and weeks
are converted to microseconds, using the assumption that one day
equals 24 hours.
Uh, fractional weeks can be integer days.
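For example, the whole-day part of a week fraction stays as integer days,
with only the sub-day remainder spilling to the time field:
SELECT interval '1.5 weeks';
     interval
------------------
 10 days 12:00:00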
On output, the months field is shown as an appropriate number of
years and months; the days field is shown as-is; the microseconds
field is converted to hours, minutes, and possibly-fractional
seconds.
Here is an updated patch that includes some of your ideas.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachment: interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 453115f942..50a2c8e5f1 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2840,15 +2840,18 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts: for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
+ because interval internally stores only three integer units (months,
+ days, microseconds), fractional units must be spilled to smaller
+ units. Fractional parts of units greater than months is rounded to
+ be an integer number of months, e.g. <literal>'1.5 years'</literal>
+ becomes <literal>'1 year 6 mons'</literal>. Fractional parts of
+ weeks and days are computed to be an integer number of days and
+ microseconds, assuming 30 days per month and 24 hours per day, e.g.,
+ <literal>'1.75 months'</literal> becomes <literal>1 mon 22 days
+ 12:00:00</literal>. Only seconds will ever be shown as fractional
+ on output.
</para>
<para>
@@ -2892,10 +2895,10 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
<para>
Internally <type>interval</type> values are stored as months, days,
- and seconds. This is done because the number of days in a month
+ and microseconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings
time adjustment is involved. The months and days fields are integers
- while the seconds field can store fractions. Because intervals are
+ while the microseconds field can store fractional seconds. Because intervals are
usually created from constant strings or <type>timestamp</type> subtraction,
this storage method works well in most cases, but can cause unexpected
results:
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..cb3fa85892 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3306,29 +3306,25 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3561,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3597,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 02b3c47223..a7e530cb5d 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -528,29 +528,25 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
On Fri, Jul 30, 2021 at 3:03 PM Bruce Momjian <bruce@momjian.us> wrote:
Here is an updated patch that includes some of your ideas.
Just to be clear, I am against this patch. I don't think it's a
minimal change for the reported problem, and I think some people will
be unhappy about the behavior changes.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Jul 30, 2021 at 03:08:56PM -0400, Robert Haas wrote:
On Fri, Jul 30, 2021 at 3:03 PM Bruce Momjian <bruce@momjian.us> wrote:
Here is an updated patch that includes some of your ideas.
Just to be clear, I am against this patch. I don't think it's a
minimal change for the reported problem, and I think some people will
be unhappy about the behavior changes.
Uh, what do you suggest then? You wanted the years/months fixed, and
rounding at spill stop time makes sense, and fixes the problem.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Jul 30, 2021 at 3:20 PM Bruce Momjian <bruce@momjian.us> wrote:
Uh, what do you suggest then? You wanted the years/months fixed, and
rounding at spill stop time makes sense, and fixes the problem.
Hmm, maybe I misunderstood. Are you saying that you think the patch
will fix cases like interval '-1.7 years 29.4 months' and interval
'29.4 months -1.7 years' to produce the same answer without changing
any other cases? I had the impression that you were proposing a bigger
change to the rules for converting fractional units to units of lower
type, particularly because Tom called it an "API change".
For some reason I can't apply the patch locally.
[rhaas pgsql]$ patch -p1 < ~/Downloads/interval.diff
(Stripping trailing CRs from patch.)
patching file doc/src/sgml/datatype.sgml
(Stripping trailing CRs from patch.)
patching file src/backend/utils/adt/datetime.c
patch: **** malformed patch at line 90: @@ -3601,7 +3597,7 @@
DecodeISO8601Interval(char *str,
--
Robert Haas
EDB: http://www.enterprisedb.com
Bruce Momjian <bruce@momjian.us> writes:
On Fri, Jul 30, 2021 at 03:08:56PM -0400, Robert Haas wrote:
Just to be clear, I am against this patch. I don't think it's a
minimal change for the reported problem, and I think some people will
be unhappy about the behavior changes.
Uh, what do you suggest then? You wanted the years/months fixed, and
rounding at spill stop time makes sense, and fixes the problem.
The complained-of bug is that 'X years Y months' isn't always
identical to 'Y months X years'. I do not believe that this patch
fixes that, though it may obscure the problem for some values of
X and Y. After a quick look at the code, I am suspicious that
the actual problem is that AdjustFractDays is applied at the wrong
time, before we've collected all the input. We probably need to
collect up all of the contributing input as floats and then do the
fractional spilling once at the end.
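One way to read that suggestion, as a standalone C sketch (hypothetical, not backend code; sign handling and the day-to-microseconds step are glossed over):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* Collect every field's contribution in months as a double;
         * '-1.7 years 29.4 months' is the same sum in either order. */
        double months = (-1.7 * 12) + 29.4;      /* 9.0, up to float noise */

        /* Spill the fractional months to days once, at the end. */
        int mon  = (int) floor(months);
        int days = (int) rint((months - mon) * 30.0);
        if (days >= 30)     /* absorb rounding carry from float noise */
        {
            mon++;
            days -= 30;
        }

        printf("%d mons %d days\n", mon, days); /* 9 mons 0 days */
        return 0;
    }

Note that this gives 9 mons and no days for the thread's example, since -20.4 + 29.4 is exactly 9 months, whereas spilling each field separately yields the 9 mons 12 days seen in the SELECTs up-thread.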
Having said that, I also agree that it's not great that '1.4 years'
is converted to '1 year 4 mons' (1.33333... years) rather than
'1 year 5 mons' (1.41666... years). The latter seems like a clearly
saner translation. I would really rather that we reject such input
altogether, but if we're going to accept it, we should round not
truncate.
regards, tom lane
On Fri, Jul 30, 2021 at 03:47:53PM -0400, Robert Haas wrote:
On Fri, Jul 30, 2021 at 3:20 PM Bruce Momjian <bruce@momjian.us> wrote:
Uh, what do you suggest then? You wanted the years/months fixed, and
rounding at spill stop time makes sense, and fixes the problem.
Hmm, maybe I misunderstood. Are you saying that you think the patch
will fix cases like interval '-1.7 years 29.4 months' and interval
'29.4 months -1.7 years' to produce the same answer without changing
any other cases? I had the impression that you were proposing a bigger
Yes, tests from the original email:
SELECT interval '-1.7 years 29.4 months';
interval
----------------
9 mons 12 days
(1 row)
SELECT interval '29.4 months -1.7 years';
interval
----------------
9 mons 12 days
(1 row)
SELECT interval '-1.7 years' + interval '29.4 months';
?column?
----------------
9 mons 12 days
(1 row)
SELECT interval '29.4 months' + interval '-1.7 years';
?column?
----------------
9 mons 12 days
(1 row)
change to the rules for converting fractional units to units of lower
type, particularly because Tom called it an "API change".
The API change is to _round_ units greater than months to integral
month values; we currently truncate. Changing the spill behavior has
been rejected.
For some reason I can't apply the patch locally.
[rhaas pgsql]$ patch -p1 < ~/Downloads/interval.diff
(Stripping trailing CRs from patch.)
patching file doc/src/sgml/datatype.sgml
(Stripping trailing CRs from patch.)
patching file src/backend/utils/adt/datetime.c
patch: **** malformed patch at line 90: @@ -3601,7 +3597,7 @@
DecodeISO8601Interval(char *str,
Uh, here is the patch again, in case that helps.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachment: interval.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 453115f942..50a2c8e5f1 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2840,15 +2840,18 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
</para>
<para>
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- <literal>'1.5 week'</literal> or <literal>'01:02:03.45'</literal>. Such input is
- converted to the appropriate number of months, days, and seconds
- for storage. When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, <literal>'1.5 month'</literal> becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts: for example, <literal>'1.5
+ weeks'</literal> or <literal>'01:02:03.45'</literal>. However,
+ because interval internally stores only three integer units (months,
+ days, microseconds), fractional units must be spilled to smaller
+ units. Fractional parts of units greater than months are rounded to
+ be an integer number of months, e.g. <literal>'1.5 years'</literal>
+ becomes <literal>'1 year 6 mons'</literal>. Fractional parts of
+ weeks and days are computed to be an integer number of days and
+ microseconds, assuming 30 days per month and 24 hours per day, e.g.,
+ <literal>'1.75 months'</literal> becomes <literal>1 mon 22 days
+ 12:00:00</literal>. Only seconds will ever be shown as fractional
+ on output.
</para>
<para>
@@ -2892,10 +2895,10 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
<para>
Internally <type>interval</type> values are stored as months, days,
- and seconds. This is done because the number of days in a month
+ and microseconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings
time adjustment is involved. The months and days fields are integers
- while the seconds field can store fractions. Because intervals are
+ while the microseconds field can store fractional seconds. Because intervals are
usually created from constant strings or <type>timestamp</type> subtraction,
this storage method works well in most cases, but can cause unexpected
results:
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..cb3fa85892 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3306,29 +3306,25 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = DTK_M(DECADE);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = DTK_M(CENTURY);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = DTK_M(MILLENNIUM);
break;
@@ -3565,7 +3561,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -3601,7 +3597,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
diff --git a/src/interfaces/ecpg/pgtypeslib/interval.c b/src/interfaces/ecpg/pgtypeslib/interval.c
index 02b3c47223..a7e530cb5d 100644
--- a/src/interfaces/ecpg/pgtypeslib/interval.c
+++ b/src/interfaces/ecpg/pgtypeslib/interval.c
@@ -155,7 +155,7 @@ DecodeISO8601Interval(char *str,
{
case 'Y':
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
break;
case 'M':
tm->tm_mon += val;
@@ -191,7 +191,7 @@ DecodeISO8601Interval(char *str,
return DTERR_BAD_FORMAT;
tm->tm_year += val;
- tm->tm_mon += (fval * MONTHS_PER_YEAR);
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
if (unit == '\0')
return 0;
if (unit == 'T')
@@ -528,29 +528,25 @@ DecodeInterval(char **field, int *ftype, int nf, /* int range, */
case DTK_YEAR:
tm->tm_year += val;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_DECADE:
tm->tm_year += val * 10;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_CENTURY:
tm->tm_year += val * 100;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
case DTK_MILLENNIUM:
tm->tm_year += val * 1000;
- if (fval != 0)
- tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+ tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
tmask = (fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR);
break;
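The week/day spill rule the documentation hunk above describes can be sketched in a few lines of standalone C (a hypothetical illustration assuming 30 days per month and 24 hours per day; this is not the backend's actual routine):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double in_months = 1.75;                     /* '1.75 months' */
        int    mon = (int) in_months;                /* 1 */
        double frac_days = (in_months - mon) * 30.0; /* 0.75 * 30 = 22.5 */
        int    days = (int) frac_days;               /* 22 */
        long long usec =
            (long long) rint((frac_days - days) * 24 * 3600 * 1000000.0);

        printf("%d mon %d days %lld usec\n", mon, days, usec);
        /* 1 mon 22 days 43200000000 usec, i.e. 1 mon 22 days 12:00:00 */
        return 0;
    }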
On Fri, Jul 30, 2021 at 03:54:42PM -0400, Tom Lane wrote:
Having said that, I also agree that it's not great that '1.4 years'
is converted to '1 year 4 mons' (1.33333... years) rather than
'1 year 5 mons' (1.41666... years). The latter seems like a clearly
saner translation. I would really rather that we reject such input
altogether, but if we're going to accept it, we should round not
truncate.
My patch returns what you want:
SELECT interval '1.4 years';
interval
---------------
1 year 5 mons
(1 row)
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Bruce Momjian <bruce@momjian.us> writes:
On Fri, Jul 30, 2021 at 03:54:42PM -0400, Tom Lane wrote:
Having said that, I also agree that it's not great that '1.4 years'
is converted to '1 year 4 mons' (1.33333... years) rather than
'1 year 5 mons' (1.41666... years). The latter seems like a clearly
saner translation. I would really rather that we reject such input
altogether, but if we're going to accept it, we should round not
truncate.
My patch returns what you want:
Yeah, as far as that point goes, I was replying to Robert's
objection, not your patch.
regards, tom lane
On Fri, Jul 30, 2021 at 03:03:13PM -0400, Bruce Momjian wrote:
On output, the months field is shown as an appropriate number of
years and months; the days field is shown as-is; the microseconds
field is converted to hours, minutes, and possibly-fractional
seconds.
Here is an updated patch that includes some of your ideas.
"Rounding" patch applied to master, and back branches got only the
adjusted "truncated" doc patch.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.