BUG #17240: <timestamptz> at time zone ... ; wrong result

Started by PG Bug reporting formover 4 years ago5 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17240
Logged by: Marek Läll
Email address: marek.lall@eesti.ee
PostgreSQL version: 13.4
Operating system: Debian
Description:

Hello!

# SELECT version();
version

------------------------------------------------------------------------------------------------------------------------
PostgreSQL 13.4 (Debian 13.4-1.pgdg100+1) on aarch64-unknown-linux-gnu,
compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
(1 row)

# SET TimeZone='UTC';

# select * from pg_timezone_names where name like 'America/Los_Angeles';
name | abbrev | utc_offset | is_dst
---------------------+--------+------------+--------
America/Los_Angeles | PDT | -07:00:00 | t
(1 row)

# select now()
, now() at time zone 'America/Los_Angeles' as correct
, now() at time zone '-07:00:00' as wrong;

now | correct |
wrong
-------------------------------+----------------------------+----------------------------
2021-10-21 09:29:07.591962+00 | 2021-10-21 02:29:07.591962 | 2021-10-21
16:29:07.591962
(1 row)

--> COLUMN "WRONG" IS EXPECTED TO HAVE THE SAME VALUE AS COLUMN "CORRECT"

# select timestamptz'2021-10-01 00:00:00 UTC' at time zone
'America/Los_Angeles' as correct_1
, timestamptz'2021-10-01 00:00:00 UTC' at time zone '-07:00' as wrong_1
, timestamptz'2021-10-01 00:00:00 America/Los_Angeles' at time zone
'America/Los_Angeles' as correct_2
, timestamptz'2021-10-01 00:00:00 -07:00:00' at time zone '-07:00' as
wrong_2
, timestamptz'2021-10-01 00:00:00 -07:00:00' at time zone
'America/Los_Angeles' as correct_3
, timestamptz'2021-10-01 00:00:00 America/Los_Angeles' at time zone
'-07:00' as wrong_3;

correct_1 | wrong_1 | correct_2 |
wrong_2 | correct_3 | wrong_3
---------------------+---------------------+---------------------+---------------------+---------------------+---------------------
2021-09-30 17:00:00 | 2021-10-01 07:00:00 | 2021-10-01 00:00:00 |
2021-10-01 14:00:00 | 2021-10-01 00:00:00 | 2021-10-01 14:00:00
(1 row)

--> COLUMNS "WRONG_<N>" ARE EXPECTED TO HAVE THE SAME VALUE AS COLUMN
"CORRECT_<N>"

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #17240: <timestamptz> at time zone ... ; wrong result

PG Bug reporting form <noreply@postgresql.org> writes:

# select now()
, now() at time zone 'America/Los_Angeles' as correct
, now() at time zone '-07:00:00' as wrong;

Unfortunately, the pure-numeric syntax for time zone names follows the
POSIX sign convention, which is opposite to the ISO convention used
in pg_timezone_names.utc_offset (and in most other places in Postgres).
So "at time zone '+07:00:00'" is what you needed to write to duplicate
the 'America/Los_Angeles' result. See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES

Now, if you'd done this:

select ... now() at time zone interval '-07:00:00' as fine

you'd have gotten the ISO sign interpretation. But an undecorated
literal string defaults to being of type text, meaning you get
the time-zone-name logic path.

The great thing about standards is there are so many to choose from :-(

regards, tom lane

#3Marek Läll
lall.marek@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #17240: <timestamptz> at time zone ... ; wrong result

Hi!

Thank you for your response, Tom.

I have multiple questions/comments but let's start one-by-one.

POSIX string (syntax) is defined as:
--> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

The std string specifies the name of the time zone.
It must be three or more characters long and must not contain a leading
colon, embedded digits, commas, nor plus and minus signs.
There is no space character separating the time zone name from the offset,
so these restrictions are necessary to parse the specification correctly.

For "Pacific Time (Canada & US)" aka "America/Los_Angeles" the POSIX
version looks like: "PST8PDT,M3.2.0,M11.1.0"

Here are results of experiment:

$ export TZ="PST8PDT,M3.2.0,M11.1.0" ; date --iso-8601=seconds
2021-10-24T12:10:51-07:00 <-- CORRECT

$ export TZ="UTC" ; date --iso-8601=seconds
2021-10-24T19:11:32+00:00 <-- CORRECT

$ export TZ="-07:00" ; date --iso-8601=seconds
2021-10-24T19:11:42+00:00 <-- TZ is IGNORED, because "-07:00" is not
valid POSIX syntax

$ export TZ="-0700" ; date --iso-8601=seconds
2021-10-24T19:11:46+00:00 <-- TZ is IGNORED, because "-0700" is not valid
POSIX syntax

$ export TZ="-07" ; date --iso-8601=seconds
2021-10-24T19:55:08+00:00 <-- TZ is IGNORED, because "-07" is not valid
POSIX syntax

$ export TZ="-07RandomMeaninglessString" ; date --iso-8601=seconds
2021-10-24T19:14:55+00:00 <-- TZ is IGNORED, because
"-07RandomMeaninglessString" is not valid POSIX syntax

$ uname -a
Linux 5.10.0-8-amd64 #1 SMP Debian 5.10.46-4 (2021-08-03) x86_64 GNU/Linux

Back to POSTGRES case:

postgres> select timestamptz'2021-10-01 07:00:00 UTC' at time zone

'-07:00' as revers_res2;

2021-10-01 14:00:00

The value '-07:00' is not a valid POSIX value but ... Postgres performs the
following (as concept):
1) first: reads in the string '-07:00'
2) then: ignores the fact that string '-07:00' is invalid POSIX value
3) then: decides to follow "POSIX sign convention" rules (still ignoring
previous fact)
4) and then: applies "POSIX sign convention" rules using invalid POSIX
value as input?

Regards
Marek

Kontakt Tom Lane (<tgl@sss.pgh.pa.us>) kirjutas kuupäeval N, 21. oktoober
2021 kell 16:41:

Show quoted text

PG Bug reporting form <noreply@postgresql.org> writes:

# select now()
, now() at time zone 'America/Los_Angeles' as correct
, now() at time zone '-07:00:00' as wrong;

Unfortunately, the pure-numeric syntax for time zone names follows the
POSIX sign convention, which is opposite to the ISO convention used
in pg_timezone_names.utc_offset (and in most other places in Postgres).
So "at time zone '+07:00:00'" is what you needed to write to duplicate
the 'America/Los_Angeles' result. See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES

Now, if you'd done this:

select ... now() at time zone interval '-07:00:00' as fine

you'd have gotten the ISO sign interpretation. But an undecorated
literal string defaults to being of type text, meaning you get
the time-zone-name logic path.

The great thing about standards is there are so many to choose from :-(

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marek Läll (#3)
Re: BUG #17240: <timestamptz> at time zone ... ; wrong result

=?UTF-8?Q?Marek_L=C3=A4ll?= <lall.marek@gmail.com> writes:

POSIX string (syntax) is defined as:
--> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

The std string specifies the name of the time zone.
It must be three or more characters long and must not contain a leading
colon, embedded digits, commas, nor plus and minus signs.

Hmm, you're reading the POSIX spec I guess, because our docs don't
say that ;-). The IANA tzdb code only enforces that STD not be empty,
and Postgres has modified it to allow empty STD as well. That's
an ancient backwards-compatibility decision that we likely ought
to change sometime, so I've intentionally not documented it in
appendix B.5 [1]https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html. The text in B.5 actually says that you need angle
brackets if you want any non-letters in STD or DST, which is more
conservative than what the IANA code will accept.

2) then: ignores the fact that string '-07:00' is invalid POSIX value

It's valid according to our interpretation of POSIX. Some experimentation
suggests that GNU date(1) enforces the POSIX definition exactly, which is
that STD be at least three characters, all alphabetic. That implies that
they wrote their own TZ parser, because the IANA reference code doesn't
act that way.

regards, tom lane

[1]: https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html

#5Marek Läll
lall.marek@gmail.com
In reply to: Tom Lane (#4)
Re: BUG #17240: <timestamptz> at time zone ... ; wrong result

Hi!

Thank you for your response.

Next question.

What about the following principle?

if there is a function
y = func( x, z)
and it is linear function (which is the case) then there is inverse
function
x = inverse_func( y, z)
and then the following is always true:
x = inverse_func( func( x, z), z)

Example with plus and minus:
y = x + z
x = y - z
x = (x + z) - z

Postgres follows the principle for valid POSIX values:
func(): timestamptz'2021-10-01 00:00:00 UTC-07:00' at time zone
'UTC' --> timestamp'2021-09-30 17:00:00'
inverse_func(): timestamptz'2021-09-30 17:00:00 UTC' at time zone
'UTC-07:00' --> timestamp'2021-10-01 00:00:00'

And Postgres follows the principle for values like 'America/Los_Angeles'.

But Postgres starts ignoring the principle for invalid POSIX (or should we
say non-POSIX) values:
func(): timestamptz'2021-10-01 00:00:00 -07:00' at time zone
'UTC' --> timestamp'2021-10-01 07:00:00'
inverse_func(): timestamptz'2021-10-01 07:00:00 UTC' at time zone
'-07:00' --> timestamp'2021-10-01 14:00:00'

Please note!
1) Values 'UTC-07:00' and '-07:00' are treated as different values if they
are specified inside a timestamp literal.
And this is expected behaviour. POSIX value and non-POSIX value are
recognized and handled properly.

2) Values 'UTC-07:00' and '-07:00' are treated as equal values if they are
specified as arguments for "at time zone".
This is NOT expected behaviour. Non-POSIX values must not be recognized
and handled as POSIX values.
Also, this behaviour breaks the simple principle described in the
beginning.

To be clear, once again, same examples, different angle.
Next 2 examples produce different result (expected):
timestamptz'2021-10-01 00:00:00 -07:00' at time zone 'UTC' -->
timestamp'2021-10-01 07:00:00'
timestamptz'2021-10-01 00:00:00 UTC-07:00' at time zone 'UTC' -->
timestamp'2021-09-30 17:00:00'

And next 2 examples produce the same result (not expected):
timestamptz'2021-10-01 07:00:00 UTC' at time zone '-07:00' -->
timestamp'2021-10-01 14:00:00'
timestamptz'2021-10-01 07:00:00 UTC' at time zone 'UTC-07:00' -->
timestamp'2021-10-01 14:00:00'

Summary:
I reported this case as the "at time zone" argument is not properly handled.
But you may say that the timestamptz literal is not properly handled.
Anyway, there is no consistent handling as you can see in examples above.

Regards
Marek

Kontakt Tom Lane (<tgl@sss.pgh.pa.us>) kirjutas kuupäeval E, 25. oktoober
2021 kell 19:26:

Show quoted text

=?UTF-8?Q?Marek_L=C3=A4ll?= <lall.marek@gmail.com> writes:

POSIX string (syntax) is defined as:
--> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

The std string specifies the name of the time zone.
It must be three or more characters long and must not contain a leading
colon, embedded digits, commas, nor plus and minus signs.

Hmm, you're reading the POSIX spec I guess, because our docs don't
say that ;-). The IANA tzdb code only enforces that STD not be empty,
and Postgres has modified it to allow empty STD as well. That's
an ancient backwards-compatibility decision that we likely ought
to change sometime, so I've intentionally not documented it in
appendix B.5 [1]. The text in B.5 actually says that you need angle
brackets if you want any non-letters in STD or DST, which is more
conservative than what the IANA code will accept.

2) then: ignores the fact that string '-07:00' is invalid POSIX value

It's valid according to our interpretation of POSIX. Some experimentation
suggests that GNU date(1) enforces the POSIX definition exactly, which is
that STD be at least three characters, all alphabetic. That implies that
they wrote their own TZ parser, because the IANA reference code doesn't
act that way.

regards, tom lane

[1]
https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html