002_types.pl fails on some timezones on windows

Started by Andres Freund over 4 years ago, 22 messages, hackers
#1Andres Freund
andres@anarazel.de

Hi,

CI showed me a failure in 002_types.pl on windows. I only just now noticed
that because the subscription tests aren't run by any of the vcregress.pl
steps :(

It turns out to be dependent on the current timezone. I have just about zero
understanding how timezones work on windows, so I can't really interpret why
that causes a problem on windows, but apparently not on linux.

The CI instance not unreasonably runs with the timezone set to GMT. With that
the tests fail. If I set it to PST, they work. For the detailed (way too long)
output see [1]. The relevant excerpt:

tzutil /s "Pacific Standard Time"
...
timeout -k60s 30m perl src/tools/msvc/vcregress.pl taptest .\src\test\subscription\ || true
t/002_types.pl ..................... ok
..

tzutil /s "Greenwich Standard Time"
timeout -k60s 30m perl src/tools/msvc/vcregress.pl taptest .\src\test\subscription\ || true
..
# Failed test 'check replicated inserts on subscriber'
# at t/002_types.pl line 278.
# got: '1|{1,2,3}
...
# 5|[5,51)
# 1|["2014-08-04 00:00:00+02",infinity)|{"[1,3)","[10,21)"}
# 2|["2014-08-02 01:00:00+02","2014-08-04 00:00:00+02")|{"[2,4)","[20,31)"}
# 3|["2014-08-01 01:00:00+02","2014-08-04 00:00:00+02")|{"[3,5)"}
# 4|["2014-07-31 01:00:00+02","2014-08-04 00:00:00+02")|{"[4,6)",NULL,"[40,51)"}
...
# expected: '1|{1,2,3}
...
# 1|["2014-08-04 00:00:00+02",infinity)|{"[1,3)","[10,21)"}
# 2|["2014-08-02 00:00:00+02","2014-08-04 00:00:00+02")|{"[2,4)","[20,31)"}
# 3|["2014-08-01 00:00:00+02","2014-08-04 00:00:00+02")|{"[3,5)"}
# 4|["2014-07-31 00:00:00+02","2014-08-04 00:00:00+02")|{"[4,6)",NULL,"[40,51)"}
...

Greetings,

Andres Freund

[1]: https://api.cirrus-ci.com/v1/task/5800120848482304/logs/check_tz_sub.log

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: 002_types.pl fails on some timezones on windows

Andres Freund <andres@anarazel.de> writes:

It turns out to be dependent on the current timezone. I have just about zero
understanding how timezones work on windows, so I can't really interpret why
that causes a problem on windows, but apparently not on linux.

Weird. Unless you're using --with-system-tzdata, I wouldn't expect that
code to work any differently on Windows.

regards, tom lane

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#1)
Re: 002_types.pl fails on some timezones on windows

On 9/30/21 2:36 PM, Andres Freund wrote:

Hi,

CI showed me a failure in 002_types.pl on windows. I only just now noticed
that because the subscription tests aren't run by any of the vcregress.pl
steps :(

We have windows buildfarm animals running the subscription tests, e.g.
<https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=drongo&dt=2021-09-29%2019%3A08%3A23&stg=subscription-check>
and they do it by calling vcregress.pl.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#3)
Re: 002_types.pl fails on some timezones on windows

Andrew Dunstan <andrew@dunslane.net> writes:

On 9/30/21 2:36 PM, Andres Freund wrote:

CI showed me a failure in 002_types.pl on windows. I only just now noticed
that because the subscription tests aren't run by any of the vcregress.pl
steps :(

We have windows buildfarm animals running the subscription tests, e.g.
<https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=drongo&dt=2021-09-29%2019%3A08%3A23&stg=subscription-check>
and they do it by calling vcregress.pl.

But are they running with the prevailing zone set to "Greenwich Standard
Time"?

I dug around to see exactly how we handle that, and was somewhat
gobsmacked to find this mapping in findtimezone.c:

/* (UTC+00:00) Monrovia, Reykjavik */
"Greenwich Standard Time", "Greenwich Daylight Time",
"Africa/Casablanca"

According to current tzdb,

# Zone NAME STDOFF RULES FORMAT [UNTIL]
Zone Africa/Casablanca -0:30:20 - LMT 1913 Oct 26
0:00 Morocco +00/+01 1984 Mar 16
1:00 - +01 1986
0:00 Morocco +00/+01 2018 Oct 28 3:00
1:00 Morocco +01/+00

Morocco has had weird changes-every-year DST rules since 2008, which'd
go a long way towards explaining funny behavior with this zone, even
without the "reverse DST" since 2018. And sure enough, 002_types.pl
falls over with TZ=Africa/Casablanca on my Linux machine, too.

I'm inclined to think we ought to be translating that zone name to
Europe/London instead. Or maybe we should translate to straight-up UTC?
But the option of "Greenwich Daylight Time" suggests that Windows thinks
this means UK civil time, not UTC.

I wonder if findtimezone.c has any other surprising Windows mappings.
I've never dug through that list particularly.

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: 002_types.pl fails on some timezones on windows

I wrote:

... sure enough, 002_types.pl
falls over with TZ=Africa/Casablanca on my Linux machine, too.

Independently of whether Africa/Casablanca is a sane translation of
that Windows zone name, it'd be nice if 002_types.pl weren't so
sensitive to the prevailing zone. I looked into exactly why it's
falling over, and the answer seems to be this bit:

(2, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '2 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[2,3]", "[20,30]"}'),
(3, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '3 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[3,4]"}'),
(4, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '4 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[4,5]", NULL, "[40,50]"}'),

The problem with this is the blithe assumption that "minus N days"
is an immutable computation. It ain't. As bad luck would have it,
these intervals all manage to cross a Moroccan DST boundary
(Ramadan, I assume):

Rule Morocco 2014 only - Jun 28 3:00 0 -
Rule Morocco 2014 only - Aug 2 2:00 1:00 -

Thus, in GMT or most other zones, we get 24-hour-spaced times of day for
these calculations:

regression=# set timezone to 'GMT';
SET
regression=# select n, 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz - n * interval '1 day' from generate_series(0,4) n;
n | ?column?
---+------------------------
0 | 2014-08-03 22:00:00+00
1 | 2014-08-02 22:00:00+00
2 | 2014-08-01 22:00:00+00
3 | 2014-07-31 22:00:00+00
4 | 2014-07-30 22:00:00+00
(5 rows)

but not so much in Morocco:

regression=# set timezone to 'Africa/Casablanca';
SET
regression=# select n, 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz - n * interval '1 day' from generate_series(0,4) n;
n | ?column?
---+------------------------
0 | 2014-08-03 23:00:00+01
1 | 2014-08-02 23:00:00+01
2 | 2014-08-01 23:00:00+00
3 | 2014-07-31 23:00:00+00
4 | 2014-07-30 23:00:00+00
(5 rows)
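
The two result sets above can be reproduced outside PostgreSQL. What follows is a minimal Python sketch, not code from the test suite. It hard-codes only the two 2014 Morocco rules quoted above as a simplified two-transition model (an assumption, not real tzdb data), and all helper names are made up for illustration. It treats "minus N days" as arithmetic on the session zone's wall clock, which is what the query output above reflects.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)

# Assumption: a two-transition model of Africa/Casablanca for mid-2014, built
# from the two tzdb rules quoted above. Both transitions fall at 02:00 UTC.
DST_OFF_UTC = datetime(2014, 6, 28, 2, 0)  # Jun 28 3:00 local (+01) -> +00
DST_ON_UTC = datetime(2014, 8, 2, 2, 0)    # Aug 2 2:00 local (+00) -> +01


def casablanca_offset(utc):
    """UTC offset of the simplified zone at a given (naive) UTC instant."""
    return timedelta(0) if DST_OFF_UTC <= utc < DST_ON_UTC else HOUR


def gmt_offset(utc):
    """Plain GMT: always +00."""
    return timedelta(0)


def local_to_utc(local, offset_at):
    """Find the UTC instant whose wall-clock reading in the zone is `local`."""
    for candidate in (HOUR, timedelta(0)):
        utc = local - candidate
        if offset_at(utc) == candidate:
            return utc
    raise ValueError("wall-clock time does not exist in this zone")


def minus_days(utc, days, offset_at):
    """Subtract whole days on the zone's wall clock, then convert back to UTC."""
    local = utc + offset_at(utc)
    return local_to_utc(local - timedelta(days=days), offset_at)


# 'Mon Aug 04 00:00:00 2014 CEST' is 2014-08-03 22:00 UTC.
BASE_UTC = datetime(2014, 8, 3, 22, 0)

if __name__ == "__main__":
    for n in range(5):
        in_gmt = minus_days(BASE_UTC, n, gmt_offset)
        in_casablanca = minus_days(BASE_UTC, n, casablanca_offset)
        # Columns: n, result under GMT, result under Casablanca, difference.
        print(n, in_gmt, in_casablanca, in_casablanca - in_gmt)
```

For n = 0 and 1 the two zones agree on the absolute instant. From n = 2 onward the Casablanca result is one hour later, which is the same one-hour discrepancy seen in the failing test output.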

What I'm inclined to do about that is get rid of the totally-irrelevant-
to-this-test interval subtractions, and just write the desired timestamps
as constants.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#3)
Re: 002_types.pl fails on some timezones on windows

Hi,

On 2021-09-30 15:19:30 -0400, Andrew Dunstan wrote:

On 9/30/21 2:36 PM, Andres Freund wrote:

Hi,

CI showed me a failure in 002_types.pl on windows. I only just now noticed
that because the subscription tests aren't run by any of the vcregress.pl
steps :(

We have windows buildfarm animals running the subscription tests, e.g.
<https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=drongo&dt=2021-09-29%2019%3A08%3A23&stg=subscription-check>
and they do it by calling vcregress.pl.

The point I was trying to make is that there's no "target" in vcregress.pl for
it. You have to know that you need to call
src/tools/msvc/vcregress.pl taptest src\test\subscription
to run them. Contrast that with recoverycheck or so, which has its own
vcregress.pl target.

Greetings,

Andres Freund

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#4)
Re: 002_types.pl fails on some timezones on windows

On Fri, Oct 1, 2021 at 8:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

But the option of "Greenwich Daylight Time" suggests that Windows thinks
this means UK civil time, not UTC.

Yes, it's been a while but IIRC Windows in the UK uses confusing
terminology here even in user interfaces, so that in summer it appears
to be wrong, which is annoying to anyone brought up on Eggert's
system. The CLDR windowsZones.xml file shows this.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: 002_types.pl fails on some timezones on windows

Andres Freund <andres@anarazel.de> writes:

It turns out to be dependent on the current timezone. I have just about zero
understanding how timezones work on windows, so I can't really interpret why
that causes a problem on windows, but apparently not on linux.

As of 20f8671ef, "TZ=Africa/Casablanca make check-world" passes here,
so your CI should be okay. We still oughta fix the Windows
translation, though.

regards, tom lane

#9Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: 002_types.pl fails on some timezones on windows

Hi,

On 2021-09-30 16:03:15 -0400, Tom Lane wrote:

I wrote:

... sure enough, 002_types.pl
falls over with TZ=Africa/Casablanca on my Linux machine, too.

Independently of whether Africa/Casablanca is a sane translation of
that Windows zone name, it'd be nice if 002_types.pl weren't so
sensitive to the prevailing zone. I looked into exactly why it's
falling over, and the answer seems to be this bit:

(2, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '2 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[2,3]", "[20,30]"}'),
(3, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '3 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[3,4]"}'),
(4, tstzrange('Mon Aug 04 00:00:00 2014 CEST'::timestamptz - interval '4 days', 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz), '{"[4,5]", NULL, "[40,50]"}'),

The problem with this is the blithe assumption that "minus N days"
is an immutable computation. It ain't. As bad luck would have it,
these intervals all manage to cross a Moroccan DST boundary
(Ramadan, I assume):

For a minute I was confused, because of course we should still get the same
result on the subscriber as on the publisher. But then I re-re-re-realized
that the comparison data is a constant in the test script...

What I'm inclined to do about that is get rid of the totally-irrelevant-
to-this-test interval subtractions, and just write the desired timestamps
as constants.

Sounds like a plan.

Greetings,

Andres Freund

#10Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: 002_types.pl fails on some timezones on windows

Hi,

On 2021-09-30 16:31:33 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

It turns out to be dependent on the current timezone. I have just about zero
understanding how timezones work on windows, so I can't really interpret why
that causes a problem on windows, but apparently not on linux.

As of 20f8671ef, "TZ=Africa/Casablanca make check-world" passes here,
so your CI should be okay. We still oughta fix the Windows
translation, though.

Indeed, it just passed (after reverting my timezone workaround):
https://cirrus-ci.com/task/5899963000422400?logs=check#L129

It still fails in t/026_overwrite_contrecord.pl though. But that's another
thread.

Thanks!

Andres

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#7)
Re: 002_types.pl fails on some timezones on windows

Thomas Munro <thomas.munro@gmail.com> writes:

Yes, it's been a while but IIRC Windows in the UK uses confusing
terminology here even in user interfaces, so that in summer it appears
to be wrong, which is annoying to anyone brought up on Eggert's
system. The CLDR windowsZones.xml file shows this.

Oh, thanks for the pointer to CLDR! I tried re-generating our data
based on theirs, and ended up with the attached draft patch.
My notes summarizing the changes say:

Choose Europe/London for "Greenwich Standard Time"
(CLDR doesn't do this, but all their mappings for it are insane)

Alphabetize a bit better

Zone name changes:

Jerusalem Standard Time -> Israel Standard Time

Numerous Russian zones slightly renamed

Should we preserve the old spellings of the above? It's not clear
how long-obsolete the old spellings are.

Maybe politically sensitive:

Asia/Hong_Kong -> Asia/Shanghai

I think the latter has way better claim on "China Standard Time",
and CLDR agrees.

Resolve Links to underlying real zones:

Asia/Kuwait -> Asia/Riyadh
Asia/Muscat -> Asia/Dubai
Australia/Canberra -> Australia/Sydney
Canada/Atlantic -> America/Halifax
Canada/Newfoundland -> America/St_Johns
Canada/Saskatchewan -> America/Regina
US/Alaska -> America/Anchorage
US/Arizona -> America/Phoenix
US/Central -> America/Chicago
US/Eastern -> America/New_York
US/Hawaii -> Pacific/Honolulu
US/Mountain -> America/Denver
US/Pacific -> America/Los_Angeles

Just plain wrong:

US/Aleutan (misspelling of US/Aleutian, which is a link anyway)

America/Salvador does not exist; tzdb says
# There are too many Salvadors elsewhere, so use America/Bahia instead
# of America/Salvador.

Etc/UTC+12 doesn't exist in tzdb

Indiana (East) is not the regular US/Eastern zone

Asia/Baku -> Asia/Yerevan (Baku is in Azerbaijan, Yerevan is in Armenia)

Asia/Dhaka -> Asia/Almaty (Dhaka has its own zone, and it's in Bangladesh
not Astana)

Europe/Sarajevo is a link to Europe/Belgrade these days, so use Warsaw

Chisinau is in Moldova not Romania

Chetumal is in Quintana Roo, which is represented by Cancun not Mexico City

Haiti has its own zone

America/Araguaina seems to just be a mistake; use Sao_Paulo

America/Buenos_Aires for SA Eastern Standard Time is a mistake
(it has its own zone)
likewise America/Caracas for SA Western Standard Time

Africa/Harare seems to be obsoleted by Africa/Johannesburg

Karachi is in Pakistan, not Tashkent

New Windows zones:

"South Sudan Standard Time" -> Africa/Juba

"West Bank Standard Time" -> Asia/Hebron
(CLDR seem to have this replacing Gaza, but I kept that one too)

"Yukon Standard Time" -> America/Whitehorse

uncomment "W. Central Africa Standard Time" as Africa/Lagos

regards, tom lane

Attachments:

windows-timezone-map-updates.patch (text/x-diff; charset=us-ascii) +104-122
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#11)
Re: 002_types.pl fails on some timezones on windows

Thomas Munro <thomas.munro@gmail.com> writes:

Yes, it's been a while but IIRC Windows in the UK uses confusing
terminology here even in user interfaces, so that in summer it appears
to be wrong, which is annoying to anyone brought up on Eggert's
system. The CLDR windowsZones.xml file shows this.

BTW, on closer inspection of CLDR's data, the Windows zone name they
associate with Europe/London is "GMT Standard Time". "Greenwich Standard
Time" is associated with a bunch of places that happen to lie near the
prime meridian, but whose timekeeping likely has nothing to do with UK
civil time:

<!-- (UTC+00:00) Monrovia, Reykjavik -->
<mapZone other="Greenwich Standard Time" territory="001" type="Atlantic/Reykjavik"/>
<mapZone other="Greenwich Standard Time" territory="BF" type="Africa/Ouagadougou"/>
<mapZone other="Greenwich Standard Time" territory="CI" type="Africa/Abidjan"/>
<mapZone other="Greenwich Standard Time" territory="GH" type="Africa/Accra"/>
<mapZone other="Greenwich Standard Time" territory="GL" type="America/Danmarkshavn"/>
<mapZone other="Greenwich Standard Time" territory="GM" type="Africa/Banjul"/>
<mapZone other="Greenwich Standard Time" territory="GN" type="Africa/Conakry"/>
<mapZone other="Greenwich Standard Time" territory="GW" type="Africa/Bissau"/>
<mapZone other="Greenwich Standard Time" territory="IS" type="Atlantic/Reykjavik"/>
<mapZone other="Greenwich Standard Time" territory="LR" type="Africa/Monrovia"/>
<mapZone other="Greenwich Standard Time" territory="ML" type="Africa/Bamako"/>
<mapZone other="Greenwich Standard Time" territory="MR" type="Africa/Nouakchott"/>
<mapZone other="Greenwich Standard Time" territory="SH" type="Atlantic/St_Helena"/>
<mapZone other="Greenwich Standard Time" territory="SL" type="Africa/Freetown"/>
<mapZone other="Greenwich Standard Time" territory="SN" type="Africa/Dakar"/>
<mapZone other="Greenwich Standard Time" territory="TG" type="Africa/Lome"/>

So arguably, the problem that started this thread was Andres' user
error: I doubt he expected "Greenwich Standard Time" to mean any
of these. Still, I think we're better off to map that to London,
because he won't be the only one to make that mistake.

BTW, I find those "territory" annotations in the CLDR data to be
fascinating. If that corresponds to something that we could retrieve
at runtime, it'd allow far better mapping of Windows zones than we
are doing now. I have no interest in working on that myself though.
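
As a rough illustration of what a territory-aware lookup could look like, here is a minimal Python sketch. It is only a sketch under stated assumptions: the sample holds a handful of the mapZone entries quoted above, the function names are hypothetical, and the territory code is simply passed in as an argument, since how to obtain it at runtime on Windows is exactly the open question.

```python
import xml.etree.ElementTree as ET

# A few <mapZone> entries copied from the CLDR excerpt quoted above. The
# enclosing <mapTimezones> element is added here only so the fragment parses.
SAMPLE = """
<mapTimezones>
  <mapZone other="Greenwich Standard Time" territory="001" type="Atlantic/Reykjavik"/>
  <mapZone other="Greenwich Standard Time" territory="CI" type="Africa/Abidjan"/>
  <mapZone other="Greenwich Standard Time" territory="GH" type="Africa/Accra"/>
  <mapZone other="Greenwich Standard Time" territory="IS" type="Atlantic/Reykjavik"/>
  <mapZone other="Greenwich Standard Time" territory="LR" type="Africa/Monrovia"/>
</mapTimezones>
"""


def load_map(xml_text):
    """Build {(windows_zone, territory): iana_zone} from mapZone elements."""
    mapping = {}
    for elem in ET.fromstring(xml_text).iter("mapZone"):
        key = (elem.get("other"), elem.get("territory"))
        # A "type" attribute can list several zones separated by spaces;
        # this sketch keeps only the first one.
        mapping[key] = elem.get("type").split()[0]
    return mapping


def resolve(mapping, windows_zone, territory=None):
    """Prefer a territory-specific entry, else fall back to territory "001"."""
    if territory is not None and (windows_zone, territory) in mapping:
        return mapping[(windows_zone, territory)]
    return mapping.get((windows_zone, "001"))


if __name__ == "__main__":
    zones = load_map(SAMPLE)
    print(resolve(zones, "Greenwich Standard Time"))
    print(resolve(zones, "Greenwich Standard Time", "LR"))
```

With no territory, or with one that has no entry of its own, the lookup falls back to the "001" row, so the behavior degrades to a single default per Windows zone name.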

regards, tom lane

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#4)
Re: 002_types.pl fails on some timezones on windows

On 9/30/21 3:38 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 9/30/21 2:36 PM, Andres Freund wrote:

CI showed me a failure in 002_types.pl on windows. I only just now noticed
that because the subscription tests aren't run by any of the vcregress.pl
steps :(

We have windows buildfarm animals running the subscription tests, e.g.
<https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=drongo&dt=2021-09-29%2019%3A08%3A23&stg=subscription-check>
and they do it by calling vcregress.pl.

But are they running with the prevailing zone set to "Greenwich Standard
Time"?

drongo's timezone is set to plain "UTC".

It also offers me "UTC+00:00(Dublin, Edinburgh, Lisbon, London)" and
"UTC+00:00(Monrovia, Reykjavik)"

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: 002_types.pl fails on some timezones on windows

Andrew Dunstan <andrew@dunslane.net> writes:

On 9/30/21 3:38 PM, Tom Lane wrote:

But are they running with the prevailing zone set to "Greenwich Standard
Time"?

drongo's timezone is set to plain "UTC".

It also offers me "UTC+00:00(Dublin, Edinburgh, Lisbon, London)" and
"UTC+00:00(Monrovia, Reykjavik)"

Yeah, the last of those is (was) the problematic one.

regards, tom lane

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#11)
Re: 002_types.pl fails on some timezones on windows

I wrote:

Oh, thanks for the pointer to CLDR! I tried re-generating our data
based on theirs, and ended up with the attached draft patch.

Hearing no objections, pushed after another round of review
and a couple more fixes.

For the archives' sake, here are the remaining discrepancies
between our mapping and CLDR's entries for "territory 001",
which I take to be their recommended defaults:

* Our documented decision to map "Central America" to "CST6",
on the grounds that most of Central America doesn't actually
observe DST nowadays.

* Now-documented decision to map "Greenwich Standard Time"
to Europe/London, not Atlantic/Reykjavik as they have it.

* The miscellaneous deltas shown in the attached diff, which in
many cases boil down to "we chose the first name mentioned for the
zone, while CLDR did something else". I felt that our historical
mappings of these cases weren't wrong enough to justify any
political flak I might take for changing them. OTOH, maybe we
should just say "we follow CLDR" and be done with it.

regards, tom lane

Attachments:

remaining-cldr-discrepancies.patch (text/x-diff; charset=us-ascii) +8-8
#16Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#15)
Re: 002_types.pl fails on some timezones on windows

On Sun, Oct 3, 2021 at 9:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Now-documented decision to map "Greenwich Standard Time"
to Europe/London, not Atlantic/Reykjavik as they have it.

Hmm. It's hard to pick a default from that set of merged zones, but
the funny thing about this choice is that Europe/London is the one
Olson zone that it's sure *not* to be, because then your system would
be using that other name, IIUC.

* The miscellaneous deltas shown in the attached diff, which in
many cases boil down to "we chose the first name mentioned for the
zone, while CLDR did something else". I felt that our historical
mappings of these cases weren't wrong enough to justify any
political flak I might take for changing them. OTOH, maybe we
should just say "we follow CLDR" and be done with it.

Eyeballing these, three look strange to me in a list of otherwise
city-based names: Pacific/Guam (instead of Port Moresby, capital of
PNG which apparently shares zone rules with the territory of Guam) and
Pacific/Samoa (country name instead of its capital Apia; the city
avoids any potential confusion with American Samoa which is on the
other side of the date line) and then "CET", an abbreviation. But
debating individual points of geography and politics like this seems a
bit silly... I wasn't really aware of this Windows->Olson zone name
problem lurking in our tree before, but it sounds to me like switching
to 100% "we use CLDR, if you think it's wrong, please file a report at
cldr.unicode.org" wouldn't be a bad idea at all!

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#12)
Re: 002_types.pl fails on some timezones on windows

On Sat, Oct 2, 2021 at 1:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, I find those "territory" annotations in the CLDR data to be
fascinating. If that corresponds to something that we could retrieve
at runtime, it'd allow far better mapping of Windows zones than we
are doing now. I have no interest in working on that myself though.

I wonder if it could be derived from the modern standards-based locale
name, which we're not currently using as a default locale but probably
should [1]. For single-zone countries you might be able to match
exactly one zone mapping.

[1]: /messages/by-id/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@mail.gmail.com

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#16)
Re: 002_types.pl fails on some timezones on windows

Thomas Munro <thomas.munro@gmail.com> writes:

On Sun, Oct 3, 2021 at 9:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Now-documented decision to map "Greenwich Standard Time"
to Europe/London, not Atlantic/Reykjavik as they have it.

Hmm. It's hard to pick a default from that set of merged zones, but
the funny thing about this choice is that Europe/London is the one
Olson zone that it's sure *not* to be, because then your system would
be using that other name, IIUC.

Agreed, this choice is definitely formally wrong. However, the example
we started the thread with is that Andres thought "Greenwich Standard
Time" would get him UTC, or at least something a lot less oddball than
what he got.

But wait a minute ... looking into the tzdb sources, I find that Iceland
hasn't observed DST since 1968, and tzdb spells their zone abbreviation as
"GMT" since then. That means that Atlantic/Reykjavik is actually a way
better approximation to "plain GMT" than Europe/London is. Maybe there
is some method in CLDR's madness here.

* The miscellaneous deltas shown in the attached diff, which in
many cases boil down to "we chose the first name mentioned for the
zone, while CLDR did something else". I felt that our historical
mappings of these cases weren't wrong enough to justify any
political flak I might take for changing them. OTOH, maybe we
should just say "we follow CLDR" and be done with it.

Eyeballing these, three look strange to me in a list of otherwise
city-based names: Pacific/Guam (instead of Port Moresby, capital of
PNG which apparently shares zone rules with the territory of Guam) and
Pacific/Samoa (country name instead of its capital Apia; the city
avoids any potential confusion with American Samoa which is on the
other side of the date line) and then "CET", an abbreviation.

Oooh. Looking closer, I see that the Windows zone is defined as
<!-- (UTC+13:00) Samoa -->
which makes it *definitely* Pacific/Apia ... Pacific/Samoa is a
link to Pacific/Pago_Pago which is in American Samoa, at UTC-11.
So our mapping was kind of okay up till 2011 when Samoa decided
they wanted to be on the other side of the date line, but now
it's wrong as can be. Ooops.

But
debating individual points of geography and politics like this seems a
bit silly... I wasn't really aware of this Windows->Olson zone name
problem lurking in our tree before, but it sounds to me like switching
to 100% "we use CLDR, if you think it's wrong, please file a report at
cldr.unicode.org" wouldn't be a bad idea at all!

I'd still defend our exception for Central America: CLDR maps that
to Guatemala which seems pretty random, even if they haven't observed
DST there for a few years. For the rest of it, though, "we follow CLDR"
has definitely got a lot of attraction. The one change that makes me
nervous is adopting Europe/Berlin for "W. Europe Standard Time",
on account of the flak Paul Eggert just got from trying to make a
somewhat-similar change :-(. (If you don't read the tz mailing list
you may not be aware of that particular tempest in a teapot, but he
tried to merge a bunch of zones into Europe/Berlin, and there were
a lot of complaints. Some from me.)

regards, tom lane

#19Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#18)
Re: 002_types.pl fails on some timezones on windows

Hi,

On October 2, 2021 3:26:35 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, the example
we started the thread with is that Andres thought "Greenwich Standard
Time" would get him UTC, or at least something a lot less oddball than
what he got.

FWIW, that was just the default on those machines (which in turn seems to be the default of some containers Microsoft distributes), not something I explicitly chose.

- Andres


#20Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#18)
Re: 002_types.pl fails on some timezones on windows

On Sun, Oct 3, 2021 at 11:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Eyeballing these, three look strange to me in a list of otherwise
city-based names: Pacific/Guam (instead of Port Moresby, capital of
PNG which apparently shares zone rules with the territory of Guam) and
Pacific/Samoa (country name instead of its capital Apia; the city
avoids any potential confusion with American Samoa which is on the
other side of the date line) and then "CET", an abbreviation.

Oooh. Looking closer, I see that the Windows zone is defined as
<!-- (UTC+13:00) Samoa -->
which makes it *definitely* Pacific/Apia ... Pacific/Samoa is a
link to Pacific/Pago_Pago which is in American Samoa, at UTC-11.
So our mapping was kind of okay up till 2011 when Samoa decided
they wanted to be on the other side of the date line, but now
it's wrong as can be. Ooops.

Hah. That's a *terrible* link to have.

I'd still defend our exception for Central America: CLDR maps that
to Guatemala which seems pretty random, even if they haven't observed
DST there for a few years. For the rest of it, though, "we follow CLDR"
has definitely got a lot of attraction. The one change that makes me
nervous is adopting Europe/Berlin for "W. Europe Standard Time",
on account of the flak Paul Eggert just got from trying to make a
somewhat-similar change :-(.

It would be interesting to know if that idea of matching BCP47 locale
names to territories could address that. Perhaps we should get that
modern-locale-name patch first (I think I got stuck on "let's kill off
old Windows versions so we can use this", due to confusing versioning
and a lack of a guiding policy on our part, but I think I should just
propose something), and then revisit this?

(If you don't read the tz mailing list
you may not be aware of that particular tempest in a teapot, but he
tried to merge a bunch of zones into Europe/Berlin, and there were
a lot of complaints. Some from me.)

I don't follow the list but there was a nice summary in LWN: "A fork
for the time-zone database?". From the peanut gallery, I thought it
was a bit of a double standard, considering the rejection of that idea
of yours about getting rid of longitude-based pre-standard times on
data stability grounds, and a lot less justifiable. I hope there
isn't a fork.

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#19)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#20)