BUG #15551: Date/Time comparison not correct when the comparison is inside join clause and involves "+" or "-"

Started by PG Bug reporting formover 7 years ago2 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 15551
Logged by: Lei He
Email address: whereverlei@gmail.com
PostgreSQL version: 11.1
Operating system: macOS Mojave 10.14
Description:

dil=# select version();
version

------------------------------------------------------------------------------------------------------------------
PostgreSQL 11.1 on x86_64-apple-darwin18.2.0, compiled by Apple LLVM
version 10.0.0 (clang-1000.11.45.5), 64-bit
(1 row)

I want to compute growth of something quarterly. Here is the simplified
data:

with code_time as (
select code, "time" from (values
('000001', date '2016-12-31'),
('000001', date '2017-03-31'),
('000001', date '2017-06-30'),
('000001', date '2017-09-30'),
('000001', date '2017-12-31'),
('000001', date '2018-03-31'),
('000001', date '2018-06-30'),
('000001', date '2018-09-30'),
('000001', date '2018-12-31'),
('000002', date '2016-12-31'),
('000002', date '2017-03-31'),
('000002', date '2017-06-30'),
('000002', date '2017-09-30'),
('000002', date '2017-12-31'),
('000002', date '2018-03-31'),
('000002', date '2018-06-30'),
('000002', date '2018-09-30'),
('000002', date '2018-12-31')
) t("code", "time")
)

The result I need is like this:
code | cur_quarter | pre_quarter
--------+-------------+-------------
000001 | 2017-03-31 | 2016-12-31
000001 | 2017-06-30 | 2017-03-31
000001 | 2017-09-30 | 2017-06-30
000001 | 2017-12-31 | 2017-09-30
000001 | 2018-03-31 | 2017-12-31
000001 | 2018-06-30 | 2018-03-31
000001 | 2018-09-30 | 2018-06-30
000001 | 2018-12-31 | 2018-09-30
000002 | 2017-03-31 | 2016-12-31
000002 | 2017-06-30 | 2017-03-31
000002 | 2017-09-30 | 2017-06-30
000002 | 2017-12-31 | 2017-09-30
000002 | 2018-03-31 | 2017-12-31
000002 | 2018-06-30 | 2018-03-31
000002 | 2018-09-30 | 2018-06-30
000002 | 2018-12-31 | 2018-09-30

If I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and
ct1."time" = ct2."time" + interval '3 months'
order by ct1.code, ct1."time";

I got:
code | cur_quarter | pre_quarter
--------+-------------+-------------
000001 | 2017-03-31 | 2016-12-31
000001 | 2017-06-30 | 2017-03-31
000001 | 2017-09-30 | 2017-06-30
000001 | 2018-03-31 | 2017-12-31
000001 | 2018-06-30 | 2018-03-31
000001 | 2018-09-30 | 2018-06-30
000002 | 2017-03-31 | 2016-12-31
000002 | 2017-06-30 | 2017-03-31
000002 | 2017-09-30 | 2017-06-30
000002 | 2018-03-31 | 2017-12-31
000002 | 2018-06-30 | 2018-03-31
000002 | 2018-09-30 | 2018-06-30

Note: the fourth quarter (*-12-31) of all years are missing.

If I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and
ct2."time" = ct1."time" - interval '3 months'
order by ct1.code, ct1."time";

I got:
code | cur_quarter | pre_quarter
--------+-------------+-------------
000001 | 2017-03-31 | 2016-12-31
000001 | 2017-09-30 | 2017-06-30
000001 | 2017-12-31 | 2017-09-30
000001 | 2018-03-31 | 2017-12-31
000001 | 2018-09-30 | 2018-06-30
000001 | 2018-12-31 | 2018-09-30
000002 | 2017-03-31 | 2016-12-31
000002 | 2017-09-30 | 2017-06-30
000002 | 2017-12-31 | 2017-09-30
000002 | 2018-03-31 | 2017-12-31
000002 | 2018-09-30 | 2018-06-30
000002 | 2018-12-31 | 2018-09-30

Note: the second quarter (*-06-30) of all years are missing.

Only if I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and (
ct1."time" = ct2."time" + interval '3 months' or ct2."time" = ct1."time" -
interval '3 months' )
order by ct1.code, ct1."time";

can I get the expected result:
code | cur_quarter | pre_quarter
--------+-------------+-------------
000001 | 2017-03-31 | 2016-12-31
000001 | 2017-06-30 | 2017-03-31
000001 | 2017-09-30 | 2017-06-30
000001 | 2017-12-31 | 2017-09-30
000001 | 2018-03-31 | 2017-12-31
000001 | 2018-06-30 | 2018-03-31
000001 | 2018-09-30 | 2018-06-30
000001 | 2018-12-31 | 2018-09-30
000002 | 2017-03-31 | 2016-12-31
000002 | 2017-06-30 | 2017-03-31
000002 | 2017-09-30 | 2017-06-30
000002 | 2017-12-31 | 2017-09-30
000002 | 2018-03-31 | 2017-12-31
000002 | 2018-06-30 | 2018-03-31
000002 | 2018-09-30 | 2018-06-30
000002 | 2018-12-31 | 2018-09-30

The only difference among these query lies in the join clause involving date
/ time comparison having date / time operators being used:
1st: ct1."time" = ct2."time" + interval '3 months'
2nd: ct2."time" = ct1."time" - interval '3 months'
3rd: ct1."time" = ct2."time" + interval '3 months' or ct2."time" =
ct1."time" - interval '3 months'

It seems to me all the 3 conditions are logically same and should have the
same result.

#2Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: PG Bug reporting form (#1)
Re: BUG #15551: Date/Time comparison not correct when the comparison is inside join clause and involves "+" or "-"

"PG" == PG Bug reporting form <noreply@postgresql.org> writes:

PG> Note: the fourth quarter (*-12-31) of all years are missing.

So the problem here is that you're trying to identify quarters by the
_last_ date, not by the _first_ date.

If you add 3 months to Sept 30th, you get Dec 30th, NOT Dec 31st.
It so happens that adding 3 months to Mar 31st gives Jun 30th, and
adding 3 months to Jun 30th gives Sep 30th, and adding 3 months to
Dec 31st gives Mar 31st, so it's only the one case that fails here.

But if you identify the quarter by its _first_ date, you have no
problem.

date_trunc('quarter', somedate::timestamp) can help with this, but
remember to cast the date to timestamp (without time zone) in the call,
otherwise you'll get incorrect results due to timezone issues.

PG> The only difference among these query lies in the join clause
PG> involving date / time comparison having date / time operators being
PG> used:

PG> 1st: ct1."time" = ct2."time" + interval '3 months'
PG> 2nd: ct2."time" = ct1."time" - interval '3 months'
PG> 3rd: ct1."time" = ct2."time" + interval '3 months' or ct2."time" =
PG> ct1."time" - interval '3 months'

PG> It seems to me all the 3 conditions are logically same and should
PG> have the same result.

Well, in the presence of months of variable lengths, they clearly are
not equivalent and cannot be. So no bug here.

--
Andrew (irc:RhodiumToad)