Inconsistent Parsing of Offsets with Seconds
Hackers,
The treatment of timestamptz (and timetz) values with offsets that include seconds seems a bit inconsistent. One can create such timestamps through the input function:
david=# select '2024-06-22T12:35:00+02:30:15'::timestamptz;
timestamptz
------------------------
2024-06-22 10:04:45+00
But the offset seconds are dropped (or rounded away?) by to_timestamp()’s `OF` and `TZ` formats[2]https://www.postgresql.org/docs/16/functions-formatting.html:
david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', 'YYYY-MM-DD HH24:MI:SSOF');
to_timestamp
------------------------
2024-06-03 10:05:00+00
david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', 'YYYY-MM-DD HH24:MI:SSTZ');
to_timestamp
------------------------
2024-06-03 02:05:00-08
The corresponding jsonpath methods don’t like offsets with seconds *at all*:
david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.datetime("YYYY-MM-DD HH24:MI:SSOF")');
ERROR: trailing characters remain in input string after datetime format
david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.timestamp_tz()');
ERROR: timestamp_tz format is not recognized: "2024-06-03 12:35:00+02:30:15"
I see from the source[1]https://github.com/postgres/postgres/blob/70a845c/src/include/datatype/timestamp.h#L136-L142 that offsets between plus or minus 15:59:59 are allowed; should the `OF` and `TZ formats be able to parse them? Or perhaps there should be a `TZS` format to complement `TZH` and `TZM`?
Best,
David
[1]: https://github.com/postgres/postgres/blob/70a845c/src/include/datatype/timestamp.h#L136-L142
[2]: https://www.postgresql.org/docs/16/functions-formatting.html
"David E. Wheeler" <david@justatheory.com> writes:
The treatment of timestamptz (and timetz) values with offsets that include seconds seems a bit inconsistent.
It's hard to get excited about this. Per the IANA TZ data,
nowhere in the world has used fractional-minute UT offsets
since 1972:
# In 1972 Liberia was the last country to switch from a UT offset
# that was not a multiple of 15 or 20 minutes.
and they were twenty years later than the next-to-last place (although
IANA will steadfastly deny reliability for their TZ data before 1970).
So timestamps like this simply don't exist in the wild.
The corresponding jsonpath methods don’t like offsets with seconds *at all*:
Perhaps that should be fixed, but it's pretty low-priority IMO.
I doubt there is any standard saying that JSON timestamps need
to be able to include that.
I see from the source[1] that offsets between plus or minus 15:59:59
are allowed; should the `OF` and `TZ formats be able to parse them?
I'd vote no. to_date/to_char already have enough trouble with format
strings being squishier than one might expect.
regards, tom lane
On Jun 22, 2024, at 13:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
It's hard to get excited about this.
I freely admit I’m getting into the weeds here. :-)
The corresponding jsonpath methods don’t like offsets with seconds *at all*:
Perhaps that should be fixed, but it's pretty low-priority IMO.
I doubt there is any standard saying that JSON timestamps need
to be able to include that.I see from the source[1] that offsets between plus or minus 15:59:59
are allowed; should the `OF` and `TZ formats be able to parse them?I'd vote no. to_date/to_char already have enough trouble with format
strings being squishier than one might expect.
I believe the former issue is caused by the latter: The jsonpath implementation uses the formatting strings to parse the timestamps[1]https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442, and since there is no formatting to support offsets with seconds, it doesn’t work at all in JSON timestamp parsing.
[1]: https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442
So if we were to fix the parsing of offsets in jsonpath, we’d either have to change the parsing code there or augment the to_timestamp() formats and use them.
Totally agree not a priority; happy to just pretend offsets with seconds don’t exist in any practical sense.
Best,
David
On Jun 22, 2024, at 14:10, David E. Wheeler <david@justatheory.com> wrote:
I believe the former issue is caused by the latter: The jsonpath implementation uses the formatting strings to parse the timestamps[1], and since there is no formatting to support offsets with seconds, it doesn’t work at all in JSON timestamp parsing.
[1]: https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442
A side-effect of this implementation of date/time parsing using the to_char templates is that only time zone offsets and abbreviations are supported. I find the behavior a little surprising TBH:
david=# select to_timestamp('2024-06-03 12:35:00America/New_York', 'YYYY-MM-DD HH24:MI:SSTZ');
ERROR: invalid value "America/New_York" for "TZ"
DETAIL: Time zone abbreviation is not recognized.
Unless the SQL standard only supports offsets and abbreviations, I wonder if we’d be better off updating the above parsing code to also try the various date/time input functions, as well as the custom formats that *are* defined by the standard.
Best,