funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Started by Josh Kupershmidtover 15 years ago11 messagesgeneral
Jump to latest
#1Josh Kupershmidt
schmiddy@gmail.com

Hi all,

I've come across a puzzling situation with a table having a timestamp
with time zone column. This column is full of values displaying
exactly as '1999-12-31 19:00:00-05', but for some reason Postgres is
treating some of these identical-seeming timestamps as being
different.

If I update all these timestamps by adding an interval of '1 DAYS' to
all rows, Postgres recognizes all the values as being the same. If I
repeat this experiment using a timestamp without time zone type,
Postgres recognizes all the timestamps as being the same.

When I pg_dump the timestamps_test table, I see a normal-looking dump:
COPY timestamps_test (ts) FROM stdin;
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05

and when I reload this pg_dump file back into the same database,
Postgres again recognizes that all the timestamps are the same (i.e.
SELECT COUNT(DISTINCT(ts)) returns 1). I've attached a plain-text
pg_dump of this table.

Here's a log of how I created this timestamps_test table, from a
source table full of these '1999-12-31 19:00:00-05' timestamps. Any
ideas what might be causing this?

test=# CREATE TABLE timestamps_test (ts timestamp with time zone NOT NULL);
CREATE TABLE
test=# INSERT INTO timestamps_test (ts) SELECT DISTINCT(updated) FROM
myschema.strange_table;
INSERT 0 119
test=# SELECT COUNT(DISTINCT(ts)) FROM timestamps_test;
count
-------
119
(1 row)

test=# SELECT DISTINCT(ts) FROM timestamps_test LIMIT 10;
ts
------------------------
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
(10 rows)

test=# SELECT MAX(ts) = MIN(ts), MAX(ts) - MIN(ts) FROM timestamps_test;
?column? | ?column?
----------+----------
f | 00:00:00
(1 row)

test=# UPDATE timestamps_test SET ts = ts + INTERVAL '1 DAYS';
UPDATE 119
test=# SELECT COUNT(DISTINCT(ts)) FROM timestamps_test;
count
-------
1
(1 row)

test=# SELECT DISTINCT(ts) FROM timestamps_test LIMIT 10;
ts
------------------------
2000-01-01 19:00:00-05
(1 row)

test=# SELECT version();
version

--------------------------------------------------------------------------------
-----------------------------------
PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 2
0080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

test=# SELECT name, setting FROM pg_settings WHERE name IN
('TimeZone', 'lc_collate', 'lc_ctype', 'lc_time', 'DateStyle');
name | setting
------------+------------
DateStyle | ISO, MDY
lc_collate | C
lc_ctype | C
lc_time | C
TimeZone | US/Eastern
(5 rows)

Thanks for any ideas,
Josh

Attachments:

timestamps_test.sqltext/x-sql; charset=US-ASCII; name=timestamps_test.sqlDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Kupershmidt (#1)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Josh Kupershmidt <schmiddy@gmail.com> writes:

I've come across a puzzling situation with a table having a timestamp
with time zone column. This column is full of values displaying
exactly as '1999-12-31 19:00:00-05', but for some reason Postgres is
treating some of these identical-seeming timestamps as being
different.

Is this installation using float or integer timestamps? If the former,
it might be interesting to look at the subtraction result
ts - '1999-12-31 19:00:00-05'::timestamptz
I'm thinking some of them might be different by submicrosecond amounts.

regards, tom lane

#3Josh Kupershmidt
schmiddy@gmail.com
In reply to: Tom Lane (#2)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

On Fri, Sep 3, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Is this installation using float or integer timestamps?  If the former,
it might be interesting to look at the subtraction result
       ts - '1999-12-31 19:00:00-05'::timestamptz
I'm thinking some of them might be different by submicrosecond amounts.

Ah yes, this is likely why. pg_config says CONFIGURE = ...
'--disable-integer-datetimes' ...

But I'm having trouble seeing for sure whether there are
submicrosecond parts of these timestamps. I just see a bunch of
'00:00:00' values with your query:

test=# SELECT ts - '1999-12-31 19:00:00-05'::timestamptz FROM
timestamps_test LIMIT 5;
?column?
----------
00:00:00
00:00:00
00:00:00
00:00:00
00:00:00
(5 rows)

And SELECT EXTRACT(microseconds FROM ts) FROM timestamps_test also
just gives me zeroes. Is there a way for me to see for sure?

Josh

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Kupershmidt (#3)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Josh Kupershmidt <schmiddy@gmail.com> writes:

On Fri, Sep 3, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm thinking some of them might be different by submicrosecond amounts.

Ah yes, this is likely why. pg_config says CONFIGURE = ...
'--disable-integer-datetimes' ...

But I'm having trouble seeing for sure whether there are
submicrosecond parts of these timestamps.

Experimenting, I can do this:

regression=# create table t1 (ts timestamptz);
CREATE TABLE
regression=# insert into t1 select '1999-12-31 19:00:00.0000001-05'::timestamptz;
INSERT 0 1
regression=# insert into t1 select '1999-12-31 19:00:00.000000-05'::timestamptz;
INSERT 0 1
regression=# select * from t1;
ts
------------------------
1999-12-31 19:00:00-05
1999-12-31 19:00:00-05
(2 rows)

regression=# select extract(epoch from ts - '1999-12-31 19:00:00-05'::timestamptz) from t1;
date_part
----------------------
1.00000761449337e-07
0
(2 rows)

This timestamp (2000-01-01 00:00 GMT) is actually the zero value
internally for Postgres timestamps, so in principle a float timestamp
has precision far smaller than microseconds for values near this.
We don't make any great effort to expose that though. It looks like
the closest value that timestamptzin makes different from zero is

regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ;
date_part
----------------------
1.45519152283669e-11
(1 row)

regards, tom lane

#5Josh Kupershmidt
schmiddy@gmail.com
In reply to: Tom Lane (#4)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

On Fri, Sep 3, 2010 at 3:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

regression=# select extract(epoch from ts - '1999-12-31 19:00:00-05'::timestamptz) from t1;
     date_part
----------------------
 1.00000761449337e-07
                   0
(2 rows)

This timestamp (2000-01-01 00:00 GMT) is actually the zero value
internally for Postgres timestamps, so in principle a float timestamp
has precision far smaller than microseconds for values near this.
We don't make any great effort to expose that though.  It looks like
the closest value that timestamptzin makes different from zero is

regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ;
     date_part
----------------------
 1.45519152283669e-11
(1 row)

EXTRACT(epoch ...) was what I was looking for:

SELECT EXTRACT(epoch FROM ts - '1999-12-31 19:00:00-05'::timestamptz)
FROM timestamps_test LIMIT 5;
date_part
-----------------------
1.4120666068199e-309
1.4154982781624e-309
1.41550281692099e-309
1.41591466059161e-309
1.41591524669472e-309
(5 rows)

Thanks for the help, Tom.

Josh

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

[ trivia warning ]

I wrote:

We don't make any great effort to expose that though. It looks like
the closest value that timestamptzin makes different from zero is

regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ;
date_part
----------------------
1.45519152283669e-11
(1 row)

Actually, it looks like the precision is being limited by the rotation
from EST zone. In GMT zone I can do this:

regression=# select extract(epoch from '2000-01-01 00:00:00.0000000000000000000000000000000000000000000000000000000000000000001'::timestamptz - '2000-01-01 00:00:00');
date_part
-----------
1e-67
(1 row)

and it could go a lot smaller except there's an arbitrary limit on
the length of input string that timestamptzin will take.

If float timestamps weren't deprecated it might be worth trying to make
this behave less surprisingly.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Kupershmidt (#5)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Josh Kupershmidt <schmiddy@gmail.com> writes:

EXTRACT(epoch ...) was what I was looking for:

SELECT EXTRACT(epoch FROM ts - '1999-12-31 19:00:00-05'::timestamptz)
FROM timestamps_test LIMIT 5;
date_part
-----------------------
1.4120666068199e-309
1.4154982781624e-309
1.41550281692099e-309
1.41591466059161e-309
1.41591524669472e-309
(5 rows)

Wow. You must have gotten those with the help of some arithmetic,
because timestamptzin would never have produced them. I found out I can
do

regression=# select extract(epoch from ('2000-01-01 00:00:00'::timestamptz + '0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001'::interval) - '2000-01-01 00:00:00');
date_part
-----------
1e-209
(1 row)

but I wonder what it was you actually did.

regards, tom lane

#8Josh Kupershmidt
schmiddy@gmail.com
In reply to: Tom Lane (#7)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

On Fri, Sep 3, 2010 at 3:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wow.  You must have gotten those with the help of some arithmetic,
because timestamptzin would never have produced them.  I found out I can
do

regression=# select extract(epoch from ('2000-01-01 00:00:00'::timestamptz + '0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001'::interval) - '2000-01-01 00:00:00');
 date_part
-----------
   1e-209
(1 row)

but I wonder what it was you actually did.

I wonder myself :-) I encountered these timestamps while going through
some C code I inherited which uses libpq to load several tables (such
as myschema.strange_table in the original example) using COPY FROM
STDIN. I don't think any timestamp arithmetic was involved. The code
was supposed to copy in legitimate timestamps, but instead loaded all
these '1999-12-31 19:00:00-05' values, and I'm still trying to figure
out how/why.

Josh

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Kupershmidt (#8)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Josh Kupershmidt <schmiddy@gmail.com> writes:

On Fri, Sep 3, 2010 at 3:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

but I wonder what it was you actually did.

I wonder myself :-) I encountered these timestamps while going through
some C code I inherited which uses libpq to load several tables (such
as myschema.strange_table in the original example) using COPY FROM
STDIN. I don't think any timestamp arithmetic was involved. The code
was supposed to copy in legitimate timestamps, but instead loaded all
these '1999-12-31 19:00:00-05' values, and I'm still trying to figure
out how/why.

Interesting. I can't imagine how you could have produced these with
plain COPY, since that would go through timestamptzin. Was it by any
chance a binary COPY? If so I could believe that funny timestamps could
get in. Maybe some confusion over endianness of the binary data, for
instance.

regards, tom lane

#10Josh Kupershmidt
schmiddy@gmail.com
In reply to: Tom Lane (#9)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

On Fri, Sep 3, 2010 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Interesting.  I can't imagine how you could have produced these with
plain COPY, since that would go through timestamptzin.  Was it by any
chance a binary COPY?  If so I could believe that funny timestamps could
get in.  Maybe some confusion over endianness of the binary data, for
instance.

Exactly, the code is using COPY ... TO STDOUT WITH BINARY along with
COPY ... FROM STDIN WITH BINARY.

Josh

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Kupershmidt (#10)
Re: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone

Josh Kupershmidt <schmiddy@gmail.com> writes:

On Fri, Sep 3, 2010 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Interesting. �I can't imagine how you could have produced these with
plain COPY, since that would go through timestamptzin. �Was it by any
chance a binary COPY? �If so I could believe that funny timestamps could
get in. �Maybe some confusion over endianness of the binary data, for
instance.

Exactly, the code is using COPY ... TO STDOUT WITH BINARY along with
COPY ... FROM STDIN WITH BINARY.

OK; what you need to look at is how the client code is preparing the
timestamp values. What they should be is floats representing seconds
since 2000-01-01 00:00 GMT, sent in bigendian byte order.

regards, tom lane