Domain based on TIMEZONE WITH TIME ZONE

Started by Ben Hoodalmost 8 years ago36 messagesgeneral
Jump to latest
#1Ben Hood
ben@relops.com

Hi,

I'm using a domain to specialize the built in TIMEZONE WITH TIME ZONE type. I want to sanity check this approach before continuing to use this.

I want to prevent timestamps with non-UTC offsets from getting inserted into the database. Having a UTC-only database at the schema level means no app, proc, script or load operation can inadvertently supply a local time offset.

The domain is defined as:

CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK (EXTRACT(TIMEZONE FROM VALUE) = 0);
CREATE TYPE UTC_RANGE AS RANGE ( SUBTYPE = UTC_TIMESTAMP );

My potentially misguided assumption is that I am effectively re-using a check constraint across the schema.

Is this crazy?

Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE internally?

Thanks in advance,

Ben

#2Ben Hood
ben@relops.com
In reply to: Ben Hood (#1)
Re: Domain based on TIMEZONE WITH TIME ZONE

Sorry about the bug in the subject: the data type is TIMESTAMP WITH TIME ZONE, not TIMEZONE WITH TIME ZONE

Show quoted text

On 10 May 2018, at 09:03, Ben Hood <ben@relops.com> wrote:

Hi,

I'm using a domain to specialize the built in TIMEZONE WITH TIME ZONE type. I want to sanity check this approach before continuing to use this.

I want to prevent timestamps with non-UTC offsets from getting inserted into the database. Having a UTC-only database at the schema level means no app, proc, script or load operation can inadvertently supply a local time offset.

The domain is defined as:

CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK (EXTRACT(TIMEZONE FROM VALUE) = 0);
CREATE TYPE UTC_RANGE AS RANGE ( SUBTYPE = UTC_TIMESTAMP );

My potentially misguided assumption is that I am effectively re-using a check constraint across the schema.

Is this crazy?

Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE internally?

Thanks in advance,

Ben

#3Francisco Olarte
folarte@peoplecall.com
In reply to: Ben Hood (#1)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 10:03 AM, Ben Hood <ben@relops.com> wrote:
...

Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE
internally?

After reading in the follow up TIMEZONE is a typo for TIMESTAMP, yes, you are.

Aproximately, postgres stores ( virtually ) a point in the time line,
for both with and without ts types, same format. Something like a real
number.

The difference is mainly for transforming from/to text ( bear in mind
when you put a constant in a query you are trasnforming from text ).
In the with time zone case it formats/expects it as a time string in
the session configured time zone, in the without case it treats it (
aproximately ) as if it was in utc ( and then discards the "+00" after
formating ).

Maybe I'm confussing you more, its not too easy to explain.

The point is TIMEZONE is not stored in either of them.

Francisco Olarte.

#4Ben Hood
ben@relops.com
In reply to: Francisco Olarte (#3)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 09:59, Francisco Olarte <folarte@peoplecall.com> wrote:

On Thu, May 10, 2018 at 10:03 AM, Ben Hood <ben@relops.com> wrote:
...

Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE
internally?

After reading in the follow up TIMEZONE is a typo for TIMESTAMP, yes, you are.

Aproximately, postgres stores ( virtually ) a point in the time line,
for both with and without ts types, same format. Something like a real
number.

The difference is mainly for transforming from/to text ( bear in mind
when you put a constant in a query you are trasnforming from text ).
In the with time zone case it formats/expects it as a time string in
the session configured time zone, in the without case it treats it (
aproximately ) as if it was in utc ( and then discards the "+00" after
formating ).

Maybe I'm confussing you more, its not too easy to explain.

The point is TIMEZONE is not stored in either of them.

Many thanks for clarification, very much appreciated.

Your point is consistent with the manual: “All timezone-aware dates and times are stored internally in UTC”

The subtleties of how a TIMESTAMP WITH TIME ZONE is stored versus how it is rendered by a client or used in calculations and queries have been discussed in numerous places.

On reflection, maybe my question was phrased badly.

The question should not be “how does Postgres store the timestamp internally”.

Rather it should read “is enforcing the submission of UTC denominated timestamps in the server by using a domain a sensible way to enforce a policy that will blow up when apps attempt to use non-UTC timestamps (when they shouldn’t be)”.

So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?

#5Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Ben Hood (#4)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 11:19:36AM +0100, Ben Hood wrote:

The question should not be “how does Postgres store the timestamp internally”.

Rather it should read “is enforcing the submission of UTC denominated timestamps in the server by using a domain a sensible way to enforce a policy that will blow up when apps attempt to use non-UTC timestamps (when they shouldn’t be)”.

So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?

I dare say it is one of PG's strengths' to be usable as a
"linter".

However, maybe rephrase to:

Is it an anti-pattern to use Postgres as a linter for
apps that forget to use ... timezones *appropriately* ... ?

As long as you can force apps to submit proper timestamp-with-
timezone data is there _really_ a need to care whether apps
do submit in UTC ? After all, it is always converted to UTC
servside anyway ?

In case you want to enforce only ever _handing out_ UTC data
you could wrap the table in a view with forces the output
timezone to UTC and only offers timestamp-withOUT-timezone to
the outside. Then force read access via the view.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#6Ben Hood
ben@relops.com
In reply to: Karsten Hilbert (#5)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 11:36, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:

On Thu, May 10, 2018 at 11:19:36AM +0100, Ben Hood wrote:

I dare say it is one of PG's strengths' to be usable as a
"linter”.

Interesting that you share this view, because after thinking about why I was doing this, using UTC domains is for PG to be a linter.

However, maybe rephrase to:

Is it an anti-pattern to use Postgres as a linter for
apps that forget to use ... timezones *appropriately* ... ?

As long as you can force apps to submit proper timestamp-with-
timezone data is there _really_ a need to care whether apps
do submit in UTC ?

OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in.

After all, it is always converted to UTC
servside anyway ?

And because of the internal UTC representation, there is no room for ambiguous timezones.

In case you want to enforce only ever _handing out_ UTC data
you could wrap the table in a view with forces the output
timezone to UTC and only offers timestamp-withOUT-timezone to
the outside. Then force read access via the view.

So on balance there is no need to use a domain for this?

Or are we saying that domains are one way of achieving the timestamp hygiene, but equally, you can get the same result as described above?

Or is there a specific downside to using a domain for this purpose?

…..maybe I’ve just got a subjective prejudice for using domains to refine and tighten built in data types….

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ben Hood (#6)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 05/10/2018 04:31 AM, Ben Hood wrote:

On 10 May 2018, at 11:36, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:

On Thu, May 10, 2018 at 11:19:36AM +0100, Ben Hood wrote:

I dare say it is one of PG's strengths' to be usable as a
"linter”.

Interesting that you share this view, because after thinking about why I was doing this, using UTC domains is for PG to be a linter.

However, maybe rephrase to:

Is it an anti-pattern to use Postgres as a linter for
apps that forget to use ... timezones *appropriately* ... ?

As long as you can force apps to submit proper timestamp-with-
timezone data is there _really_ a need to care whether apps
do submit in UTC ?

OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in.

Not really:

https://www.postgresql.org/docs/10/static/datatype-datetime.html

"For timestamp with time zone, the internally stored value is always in
UTC (Universal Coordinated Time, traditionally known as Greenwich Mean
Time, GMT). An input value that has an explicit time zone specified is
converted to UTC using the appropriate offset for that time zone. If no
time zone is stated in the input string, then it is assumed to be in the
time zone indicated by the system's TimeZone parameter, and is converted
to UTC using the offset for the timezone zone."

After all, it is always converted to UTC
servside anyway ?

And because of the internal UTC representation, there is no room for ambiguous timezones.

Define ambiguous timezone?

In case you want to enforce only ever _handing out_ UTC data
you could wrap the table in a view with forces the output
timezone to UTC and only offers timestamp-withOUT-timezone to
the outside. Then force read access via the view.

So on balance there is no need to use a domain for this?

Or are we saying that domains are one way of achieving the timestamp hygiene, but equally, you can get the same result as described above?

Or is there a specific downside to using a domain for this purpose?

…..maybe I’ve just got a subjective prejudice for using domains to refine and tighten built in data types….

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Vick Khera
vivek@khera.org
In reply to: Ben Hood (#6)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 7:31 AM, Ben Hood <ben@relops.com> wrote:

Or are we saying that domains are one way of achieving the timestamp
hygiene, but equally, you can get the same result as described above?

The *only* way to have timestamp hygiene is to require them to have time
zones at all times, even if that time zone is UTC. Any other representation
of a time is ambiguous without context.

#9Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Adrian Klaver (#7)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 06:41:04AM -0700, Adrian Klaver wrote:

OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in.

Not really:

https://www.postgresql.org/docs/10/static/datatype-datetime.html

"For timestamp with time zone, the internally stored value is always in UTC
(Universal Coordinated Time, traditionally known as Greenwich Mean Time,
GMT). An input value that has an explicit time zone specified is converted
to UTC using the appropriate offset for that time zone. If no time zone is
stated in the input string, then it is assumed to be in the time zone
indicated by the system's TimeZone parameter, and is converted to UTC using
the offset for the timezone zone."

True enough, I didn't remember that behaviour.

And since a BEFORE UPDATE/INSERT trigger will see the data to
be inserted PG will have already done that while parsing from
on-the-wire data into in-memory ts-w-tz presentation so we
can't simply use a trigger to enforce explicit specification
of a timezone.

Therefore, a domain could work but will require client
language support for easy integration.

And because of the internal UTC representation, there is no room for ambiguous timezones.

Define ambiguous timezone?

OP likely means underspecified for his use case (= not
assuming "unspecified" to mean "TimeZone value"). But, then,
OP could always force TimeZone to UTC on his servers :-)

Karsten
--

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben Hood (#4)
Re: Domain based on TIMEZONE WITH TIME ZONE

Ben Hood <ben@relops.com> writes:

So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?

Well, using a domain to enforce additional constraints on a field's value
is certainly not an anti-pattern in itself. But you have to realize that
the processing consists of first creating a value of the base type and
then applying the constraint expressions of the domain to it. This means
you cannot check any details that are lost in the input conversion,
because you don't have access to the original input string, only the
stored value.

As others have explained, Postgres' TIMESTAMP WITH TIME ZONE type doesn't
preserve the input's timezone specification (if any) but forcibly rotates
to UTC and stores just a scalar UTC value. So you can't use a domain to
check anything about whether the input had a timezone field and if so what
it was.

(This behavior is nonstandard --- the SQL spec evidently expects the
timezone to be stored explicitly in some fashion --- but I don't foresee
us changing it; we've accumulated too much backwards-compatibility
baggage now.)

If you're sufficiently intent on having checking of that sort, you could
invent your own datatype with your own input function, and then make it
binary-compatible with timestamptz so that you don't need to provide much
else besides the I/O functions. varchar(n) has the same sort of
relationship with text, so there's precedent ...

regards, tom lane

#11Francisco Olarte
folarte@peoplecall.com
In reply to: Ben Hood (#4)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 12:19 PM, Ben Hood <ben@relops.com> wrote:

On 10 May 2018, at 09:59, Francisco Olarte <folarte@peoplecall.com> wrote:

....

Maybe I'm confussing you more, its not too easy to explain.
The point is TIMEZONE is not stored in either of them.

Many thanks for clarification, very much appreciated.

Your point is consistent with the manual: “All timezone-aware dates and times are stored internally in UTC”

Tom lane, which is much more knowledgeable than me, points they are
stored in UTC. Maybe, but, IIRC, they are stored in 8 bytes, I do not
know exactly how, last time I read it that could be either a float8
number or a int8 ( maybe to int4 numbers ). I think they use the same
type of storage as the unix timestamps ( unix timestamp is normally a
signed number of integer seconds from an arbitrary start point,
19700101T000000 UTC, and they designated an instant in time. 7200
designates an instant, I can format it for the reader in many ways,
19700101T020000+0000, "tres de la mañana del primero de enero de mil
novecientos ochenta, hora de Madrid", "1970-01-01 03:00:00 +01" ). But
it is not UTC, it is the designation of an instant in time. Timestamps
do not have time zones. BUT the postgres data types timestamptz is a
way to store a timestamp. So is timestamp. And so is float8. The
difference is how it is converted and interacts with other types.

Rather it should read “is enforcing the submission of UTC denominated timestamps in the server by using a domain a sensible way to enforce a policy that will blow up when apps attempt to use non-UTC timestamps (when they shouldn’t be)”.
So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?

That poses a problem. You must check the input representation. I
mean..., you were using extract on the value, but this happens ( YMMV
):

cdrs=# show timezone;
TimeZone
---------------
Europe/Madrid
(1 row)

cdrs=# select extract(timezone from current_timestamp);
date_part
-----------
7200
(1 row)

cdrs=# set timezone TO 'UTC';
SET
cdrs=# select extract(timezone from current_timestamp);
date_part
-----------
0
(1 row)

cdrs=# select extract(timezone from '2018-01-01
07:00:00+0100'::timestamptz), extract(timezone from '2018-01-01
00:00:00-0300'::timestamptz);
date_part | date_part
-----------+-----------
0 | 0
(1 row)

cdrs=# set timezone to 'Europe/Madrid';
SET
cdrs=# select extract(timezone from '2018-01-01
07:00:00+0100'::timestamptz), extract(timezone from '2018-01-01
00:00:00-0300'::timestamptz);
date_part | date_part
-----------+-----------
3600 | 3600
(1 row)

cdrs=# select extract(timezone from '2018-01-01
07:00:00+0000'::timestamptz), extract(timezone from '2018-07-01
00:00:00+0000'::timestamptz);
date_part | date_part
-----------+-----------
3600 | 7200
(1 row)

As you can see you are getting the seconds offset for the client, may
be in a domain for the server, timezone at the instant in time
designated by the value. Not what you originally typed to locate the
instant in time.

For what you want to do I think you'll have to parse the text value,
maybe by definig a view with a text columns and using some
rule/trigger magic for insert / updates.

It seems you want to force your users to use an explcit time zone.
This may be better handled above the database.

Francisco Olarte.

#12Ben Hood
ben@relops.com
In reply to: Adrian Klaver (#7)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 14:41, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in.

Not really:

https://www.postgresql.org/docs/10/static/datatype-datetime.html

"For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's TimeZone parameter, and is converted to UTC using the offset for the timezone zone.”

Many thanks for this clarification. So therefore you can’t rely on the TIMESTAMP WITH TIME ZONE forcing the app to explicitly specify the offset. This is is because if the app is not specific, then the server will default back to its configured timezone.

So to get deterministic timestamps, you could either:

a) make sure the server is always configured to run in UTC;
b) issue SET TIME ZONE ‘UTC’; at the beginning of any application session

After all, it is always converted to UTC
servside anyway ?

And because of the internal UTC representation, there is no room for ambiguous timezones.

Define ambiguous timezone?

What I meant to say that is there should be no possibility for an effective timezone to arise implicitly.

For example, if you

(1) didn’t control the db server config

and

(2) and you forgot to enforce UTC at a client driver level

and

(3) didn’t set the offset in the app session

Then the only way to know what the effective zone offset will be is to find out what the server default is.

Is this plausible?

#13Ben Hood
ben@relops.com
In reply to: Vick Khera (#8)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 15:12, Vick Khera <vivek@khera.org> wrote:

On Thu, May 10, 2018 at 7:31 AM, Ben Hood <ben@relops.com <mailto:ben@relops.com>> wrote:
Or are we saying that domains are one way of achieving the timestamp hygiene, but equally, you can get the same result as described above?

The *only* way to have timestamp hygiene is to require them to have time zones at all times, even if that time zone is UTC. Any other representation of a time is ambiguous without context.

That makes sense.

The motivation behind narrowing the built in TIMESTAMP WITH TIME ZONE down to a domain is to ensure the only permissible zone offset is UTC. This would be unambiguous.

#14Ben Hood
ben@relops.com
In reply to: Karsten Hilbert (#9)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 15:17, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:

Not really:

https://www.postgresql.org/docs/10/static/datatype-datetime.html

"For timestamp with time zone, the internally stored value is always in UTC
(Universal Coordinated Time, traditionally known as Greenwich Mean Time,
GMT). An input value that has an explicit time zone specified is converted
to UTC using the appropriate offset for that time zone. If no time zone is
stated in the input string, then it is assumed to be in the time zone
indicated by the system's TimeZone parameter, and is converted to UTC using
the offset for the timezone zone."

True enough, I didn't remember that behaviour.

And since a BEFORE UPDATE/INSERT trigger will see the data to
be inserted PG will have already done that while parsing from
on-the-wire data into in-memory ts-w-tz presentation so we
can't simply use a trigger to enforce explicit specification
of a timezone.

Therefore, a domain could work but will require client
language support for easy integration.

For reference, the domain based solution has been working for a while.

It has fleshed out bugs in apps that failed to set the zone either on the session level at the field level.

In those circumstances, the server raised an exception to say that a non-UTC timestamp was submitted.

It could have been the timestamp had a specific zone other than UTC. In this case, the behavior would be that Postgres stores it as UTC and therefore all is good.

But it also would be that a timestamp WITHOUT a zone was submitted, in which case, the server default would take precedence.

And because of the internal UTC representation, there is no room for ambiguous timezones.

Define ambiguous timezone?

OP likely means underspecified for his use case (= not
assuming "unspecified" to mean "TimeZone value"). But, then,
OP could always force TimeZone to UTC on his servers :-)

That is what the OP meant. That the zone value was not explicit in all cases. For example, if you fallback to the server default.

And yes, the OP could have set UTC both on the DB servers (assuming OP controls them) and within each server app (assuming OP controls them).

#15David G. Johnston
david.g.johnston@gmail.com
In reply to: Ben Hood (#13)
Re: Domain based on TIMEZONE WITH TIME ZONE

On Thu, May 10, 2018 at 9:13 AM, Ben Hood <ben@relops.com> wrote:

On 10 May 2018, at 15:12, Vick Khera <vivek@khera.org> wrote:

On Thu, May 10, 2018 at 7:31 AM, Ben Hood <ben@relops.com> wrote:

Or are we saying that domains are one way of achieving the timestamp
hygiene, but equally, you can get the same result as described above?

The *only* way to have timestamp hygiene is to require them to have time
zones at all times, even if that time zone is UTC. Any other representation
of a time is ambiguous without context.

That makes sense.

The motivation behind narrowing the built in TIMESTAMP WITH TIME ZONE down
to a domain is to ensure the only permissible zone offset is UTC. This
would be unambiguous.

​'2018-05-10T15:23:00-07:00​'::timestamptz is unambiguous

Allowing client applications to represent time in the user's timezone is a
feature.

​"""Ben
So to get deterministic timestamps, you could either:

a) make sure the server is always configured to run in UTC;
b) issue SET TIME ZONE ‘UTC’; at the beginning of any application session
"""

No

If I send 4pm ET to the server to be stored in a timestamptz field, and
fail to tell the server that the timezone is ET in the value itself then I
must instead set my session timezone to ET or the server is going to store
the wrong value. There is nothing you can do in an default server to
prevent this. Tom has described how you could possibly make the "fail to
tell the server that the timezone is ET" impossible using a custom type.
This seems to be what you want though I'd question whether it is worth the
cost.

I'm not sure how binary timestamp values being sent to the server in a BIND
command plays into this...

David J.

#16Ben Hood
ben@relops.com
In reply to: Tom Lane (#10)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 15:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ben Hood <ben@relops.com> writes:

So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?

Well, using a domain to enforce additional constraints on a field's value
is certainly not an anti-pattern in itself. But you have to realize that
the processing consists of first creating a value of the base type and
then applying the constraint expressions of the domain to it. This means
you cannot check any details that are lost in the input conversion,
because you don't have access to the original input string, only the
stored value.

Fair point.

In the case of the domain based on TIMESTAMP WITH TIME ZONE, if you are mandating UTC, what further details could get lost that may have been in the original input string?

The semantics are that inserting into a column of this domain with a non-UTC or absent zone will be rejected and hence is data you don’t want in the database in any case.

As others have explained, Postgres' TIMESTAMP WITH TIME ZONE type doesn't
preserve the input's timezone specification (if any) but forcibly rotates
to UTC and stores just a scalar UTC value. So you can't use a domain to
check anything about whether the input had a timezone field and if so what
it was.

OK, point taken. The intention of the domain is prevent any non-UTC or absent zones, so do you need to check anything after you have inserted it?

(This behavior is nonstandard --- the SQL spec evidently expects the
timezone to be stored explicitly in some fashion --- but I don't foresee
us changing it; we've accumulated too much backwards-compatibility
baggage now.)

If you're sufficiently intent on having checking of that sort, you could
invent your own datatype with your own input function, and then make it
binary-compatible with timestamptz so that you don't need to provide much
else besides the I/O functions. varchar(n) has the same sort of
relationship with text, so there's precedent …

Many thanks for your feedback, much appreciated. What I hear you saying is that is possible and similar things have been done before. But by the same token, just because you can do it, doesn’t necssarily mean it buys you that much.

So if a person of your experience hasn’t come across something like this before, it might mean that on balance it’s not worth the effort and/or potential deviation from standard DB tooling. Be pragmatic.

As an aside, this whole discussion has been super valuable for me to get a better understanding of exactly what is going on. This is precisely the intention of the domain in the first place - to be explicit about timestamps everywhere.

So if I get timestamp UTC explicitness everywhere in my database by actually understanding what is going on as opposed to using a domain, then I have achieved my goal.

I appreciate everybody chiming in on this topic :-)

#17Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ben Hood (#12)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 05/10/2018 09:09 AM, Ben Hood wrote:

On 10 May 2018, at 14:41, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in.

Not really:

https://www.postgresql.org/docs/10/static/datatype-datetime.html

"For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's TimeZone parameter, and is converted to UTC using the offset for the timezone zone.”

Many thanks for this clarification. So therefore you can’t rely on the TIMESTAMP WITH TIME ZONE forcing the app to explicitly specify the offset. This is is because if the app is not specific, then the server will default back to its configured timezone.

So to get deterministic timestamps, you could either:

a) make sure the server is always configured to run in UTC;
b) issue SET TIME ZONE ‘UTC’; at the beginning of any application session

Well if you are using a timestamp with timezone field the value is
always going to be stored as UTC. The TimeZone setting just determines
the rotation from the input value to the stored value and the reverse.
My previous point was just that Postgres will not enforce an offset on
input data.

After all, it is always converted to UTC
servside anyway ?

And because of the internal UTC representation, there is no room for ambiguous timezones.

Define ambiguous timezone?

What I meant to say that is there should be no possibility for an effective timezone to arise implicitly.

For example, if you

(1) didn’t control the db server config

and

(2) and you forgot to enforce UTC at a client driver level

and

(3) didn’t set the offset in the app session

Then the only way to know what the effective zone offset will be is to find out what the server default is.

Is this plausible?

If you mean find the server default then yes:

test_(aklaver)> select current_setting('TimeZone');
current_setting
-----------------
US/Pacific

https://www.postgresql.org/docs/10/static/functions-admin.html#FUNCTIONS-ADMIN-SET

--
Adrian Klaver
adrian.klaver@aklaver.com

#18Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ben Hood (#13)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 05/10/2018 09:13 AM, Ben Hood wrote:

On 10 May 2018, at 15:12, Vick Khera <vivek@khera.org
<mailto:vivek@khera.org>> wrote:

On Thu, May 10, 2018 at 7:31 AM, Ben Hood <ben@relops.com
<mailto:ben@relops.com>> wrote:

Or are we saying that domains are one way of achieving the
timestamp hygiene, but equally, you can get the same result as
described above?

The *only* way to have timestamp hygiene is to require them to have
time zones at all times, even if that time zone is UTC. Any other
representation of a time is ambiguous without context.

That makes sense.

The motivation behind narrowing the built in TIMESTAMP WITH TIME ZONE
down to a domain is to ensure the only permissible zone offset is UTC.
This would be unambiguous.

Per my previous post a timestamp with timezone is going to be stored as
UTC, so there is no ambiguity there. On reflection I realized your
concern maybe with determining the original input timezone. That
information is not stored by Postgres, so there can be ambiguity as to
their value. Doing all timestamps in UTC is one one way to eliminate
this. The potential issue I see is that you now push the ambiguity into
the app. Namely just from looking at the database values you still do
not know what the original timezone the app lives in is.

--
Adrian Klaver
adrian.klaver@aklaver.com

#19Ben Hood
ben@relops.com
In reply to: Adrian Klaver (#18)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 18:29, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

Per my previous post a timestamp with timezone is going to be stored as UTC, so there is no ambiguity there. On reflection I realized your concern maybe with determining the original input timezone. That information is not stored by Postgres, so there can be ambiguity as to their value. Doing all timestamps in UTC is one one way to eliminate this. The potential issue I see is that you now push the ambiguity into the app. Namely just from looking at the database values you still do not know what the original timezone the app lives in is.

That’s very true, I hadn’t thought of that.

The use case I have is a schema that unites billing events from disparate telecoms systems from a bunch of different carriers. The source of the non-specific timestamps is party input data files that provide different local timestamps for systems in different systems, but also different apps that submit billing events. So there are many inconsistent feeds into the database.

So when we push the ambiguity into the app, at least what is happening is the the transaction is rejected which means the app breaks. When the app breaks, we can go in and fix the timestamp bug in the particular app. Often this is difficult, and we need to reason about the source data, but the breakage at least tells us that something is wrong. Otherwise we ingress the data, do complex billing queries and then the only time we find out about a bug is when a customer complains about a particular bill. When this happens, it is really difficult to determine whether there is a bug in the query logic or if the input is bogus.

So in this sense, the database is linting the the source data.

#20Ben Hood
ben@relops.com
In reply to: David G. Johnston (#15)
Re: Domain based on TIMEZONE WITH TIME ZONE

On 10 May 2018, at 17:35, David G. Johnston <david.g.johnston@gmail.com> wrote:

​'2018-05-10T15:23:00-07:00​'::timestamptz is unambiguous

That is true. Mandating UTC is not the only way to eliminate ambiguity. Apologies for appearing to suggest that this is case.

Allowing client applications to represent time in the user's timezone is a feature.

Fair point.

If I send 4pm ET to the server to be stored in a timestamptz field, and fail to tell the server that the timezone is ET in the value itself then I must instead set my session timezone to ET or the server is going to store the wrong value. There is nothing you can do in an default server to prevent this. Tom has described how you could possibly make the "fail to tell the server that the timezone is ET" impossible using a custom type. This seems to be what you want though I'd question whether it is worth the cost.

The domain approach is what has been implemented and has been running in production for some time without a problem (that we know of).

The cost of this appears to be quite trivial in Postgres - there are 2 lines of DDL in the OP that describe the implementation and it seems to work.

The reason why I started this thread is because somebody on a JDBC related list told me that I was crazy for trying to do this and that I had most likely misunderstood how Postgres works. So respecting their opinion, I decided to ask the Postgres experts.

It is heartening to hear Tom say that the idea is not completely insane, but questions the business value of mandating UTC, given there are other ways to ensure timezone explicitness across the app(s) and the database.

I’m also somewhat wiser about Postgres works now as well.

I'm not sure how binary timestamp values being sent to the server in a BIND command plays into this...

The bit I’m ignorant of is that the domain approach appears to work with the timestampz wire representation. Apps seem to bind using the platform timestampz representation and the server barfs if the app forgets to enforce UTC.

#21Ben Hood
ben@relops.com
In reply to: Adrian Klaver (#17)
#22Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Ben Hood (#21)
#23Ben Hood
ben@relops.com
In reply to: Francisco Olarte (#11)
#24Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Ben Hood (#23)
#25Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Peter J. Holzer (#24)
#26Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ben Hood (#23)
#27Ben Hood
ben@relops.com
In reply to: Peter J. Holzer (#24)
#28Ben Hood
ben@relops.com
In reply to: Adrian Klaver (#26)
#29Francisco Olarte
folarte@peoplecall.com
In reply to: Ben Hood (#23)
#30Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Francisco Olarte (#29)
#31Francisco Olarte
folarte@peoplecall.com
In reply to: Adrian Klaver (#30)
#32Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Francisco Olarte (#31)
#33Francisco Olarte
folarte@peoplecall.com
In reply to: Adrian Klaver (#32)
#34Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Francisco Olarte (#33)
#35Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#34)
#36Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Peter J. Holzer (#35)