Fix overflow in DecodeInterval
The attached patch fixes an overflow bug in DecodeInterval when applying
the units week, decade, century, and millennium. The overflow check logic
was modelled after the overflow check at the beginning of `int
tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in timestamp.c.
This is my first patch, so apologies if I screwed up the process somewhere.
- Joe Koshakow
Attachments:
0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +64 -5)
Joseph Koshakow <koshy44@gmail.com> writes:
The attached patch fixes an overflow bug in DecodeInterval when applying
the units week, decade, century, and millennium. The overflow check logic
was modelled after the overflow check at the beginning of `int
tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in timestamp.c.
Good catch, but I don't think that tm2interval code is best practice
anymore. Rather than bringing "double" arithmetic into the mix,
you should use the overflow-detecting arithmetic functions in
src/include/common/int.h. The existing code here is also pretty
faulty in that it doesn't notice addition overflow when combining
multiple units. So for example, instead of
tm->tm_mday += val * 7;
I think we should write something like
if (pg_mul_s32_overflow(val, 7, &tmp))
return DTERR_FIELD_OVERFLOW;
if (pg_add_s32_overflow(tm->tm_mday, tmp, &tm->tm_mday))
return DTERR_FIELD_OVERFLOW;
Perhaps some macros could be used to make this more legible?
regards, tom lane
Thanks for the feedback Tom, I've attached an updated patch with
your suggestions. Feel free to rename the horribly named macro.
Also while fixing this I noticed that fractional intervals can also
cause an overflow issue.
postgres=# SELECT INTERVAL '0.1 months 2147483647 days';
interval
------------------
-2147483646 days
(1 row)
I haven't looked into it, but it's probably a similar cause.
Attachments:
0002-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +58 -5)
On Fri, Feb 11, 2022 at 4:58 PM Joseph Koshakow <koshy44@gmail.com> wrote:
Also while fixing this I noticed that fractional intervals can also
cause an overflow issue.
postgres=# SELECT INTERVAL '0.1 months 2147483647 days';
interval
------------------
-2147483646 days
(1 row)
I haven't looked into it, but it's probably a similar cause.
Hey Tom,
I was originally going to fix the fractional issue in a follow-up
patch, but I took a look at it and decided to make a new patch
with both fixes. I tried to make the handling of fractional and
non-fractional units consistent. I've attached the patch below.
While investigating this, I've noticed a couple more overflow
issues with Intervals, but I think they're best handled in separate
patches. I've included the ones I've found in this email.
postgres=# CREATE TEMP TABLE INTERVAL_TBL_OF (f1 interval);
CREATE TABLE
postgres=# INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('0.1 weeks
2147483647 hrs');
INSERT 0 1
postgres=# SELECT * FROM INTERVAL_TBL_OF;
ERROR: interval out of range
postgres=# SELECT justify_days(INTERVAL '2147483647 months 2147483647 days');
justify_days
-----------------------------------
-172991738 years -4 mons -23 days
(1 row)
Cheers,
Joe Koshakow
Attachments:
0003-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +152 -23)
Hi,
On 2022-02-12 21:16:05 -0500, Joseph Koshakow wrote:
I've attached the patch below.
Any reason for using int return types?
+/* As above, but initial val produces years */
+static int
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+
+	if (pg_mul_s32_overflow(val, multiplier, &years))
+		return 1;
+	if (pg_add_s32_overflow(tm->tm_year, years, &tm->tm_year))
+		return 1;
+	return 0;
+}
particularly since the pg_*_overflow stuff uses bool?
Greetings,
Andres Freund
On Sat, Feb 12, 2022 at 10:51 PM Andres Freund <andres@anarazel.de> wrote:
Any reason for using int return types?
particularly since the pg_*_overflow stuff uses bool?
I chose int return types to keep all these methods
consistent with DecodeInterval, which returns a
non-zero int to indicate an error. Though I wasn't sure
if an int or bool would be best, so I'm happy to change
to bool if people think that's better.
Also I'm realizing now that I've incorrectly been using the
number of the patch to indicate the version, instead of just
sticking a v3 to the front. So sorry about that, all the patches
I sent in this thread are the same patch, just different versions.
- Joe Koshakow
Actually found an additional overflow issue in
DecodeInterval. I've updated the patch with the
fix. Specifically it prevents trying to negate a field
if it equals INT_MIN. Let me know if this is best
put into another patch.
- Joe Koshakow
Attachments:
v4-0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +172 -23)
Hi,
On 2022-02-13 09:35:47 -0500, Joseph Koshakow wrote:
I chose int return types to keep all these methods
consistent with DecodeInterval, which returns a
non-zero int to indicate an error.
That's different, because it actually returns different types of errors. IMO
that difference is actually reason to use a bool for the new cases, because
then it's a tad clearer that they don't return DTERR_*.
Though I wasn't sure
if an int or bool would be best, so I'm happy to change
to bool if people think that's better.
+1 for bool.
Also I'm realizing now that I've incorrectly been using the
number of the patch to indicate the version, instead of just
sticking a v3 to the front. So sorry about that, all the patches
I sent in this thread are the same patch, just different versions.
No worries ;)
Do we want to consider backpatching these fixes? If so, I'd argue for skipping
10, because it doesn't have int.h stuff yet. There's also the issue with
potentially breaking indexes / constraints? Not that that goes entirely
away across major versions...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
Do we want to consider backpatching these fixes?
I wasn't thinking that we should. It's a behavioral change and
people might not be pleased to have it appear in minor releases.
regards, tom lane
Attached is a new version switching from ints to bools, as requested.
- Joe Koshakow
Attachments:
v5-0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +164 -23)
On Sun, Feb 13, 2022 at 5:12 PM Joseph Koshakow <koshy44@gmail.com> wrote:
Attached is a new version switching from ints to bools, as requested.
- Joe Koshakow
Tom, Andres,
Thanks for your feedback! I'm not super familiar with the process,
should I submit this thread to the commitfest or just leave it as is?
Thanks,
Joe Koshakow
Hi,
On 2022-02-15 06:44:40 -0500, Joseph Koshakow wrote:
Thanks for your feedback! I'm not super familiar with the process,
should I submit this thread to the commitfest or just leave it as is?
Submit it to the CF - then we get an automatic test on a few platforms. I
think we can apply it quickly after that...
Greetings,
Andres Freund
Ok, so I've attached a patch with my final unprompted changes. It
contains the following two changes:
1. I found some more overflows with the ISO8601 formats and have
included some fixes.
2. I reverted the overflow checks for the seconds field. It's actually a
bit more complicated than I thought. For example consider the following
query:
postgres=# SELECT INTERVAL '0.99999999 min 2147483647 sec';
interval
----------------------
-596523:13:09.000001
(1 row)
This query will overflow the tm_sec field of the struct pg_tm, however
it's not actually out of range for the Interval. I'm not sure the best
way to handle this right now, but I think it would be best to leave it
for a future patch. Perhaps the time related fields in struct pg_tm
need to be changed to 64 bit ints.
- Joe Koshakow
Attachments:
v6-0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +232 -25)
Copying over a related thread here to have info all in one place.
It's a little fragmented so sorry about that.
On Fri, Feb 18, 2022 at 11:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Joseph Koshakow <koshy44@gmail.com> writes:
When formatting the output of an Interval, we call abs() on the hours
field of the Interval. Calling abs(INT_MIN) returns back INT_MIN
causing the output to contain two '-' characters. The attached patch
fixes that issue by special casing INT_MIN hours.
Good catch, but it looks to me like three out of the four formats in
EncodeInterval have variants of this problem --- there are assumptions
throughout that code that we can compute "-x" or "abs(x)" without
fear. Not much point in fixing only one symptom.
Also, I notice that there's an overflow hazard upstream of here,
in interval2tm:
regression=# select interval '214748364 hours' * 11;
ERROR: interval out of range
regression=# \errverbose
ERROR: 22008: interval out of range
LOCATION: interval2tm, timestamp.c:1982
There's no good excuse for not being able to print a value that
we computed successfully.
I wonder if the most reasonable fix would be to start using int64
instead of int arithmetic for the values that are potentially large.
I doubt that we'd be taking much of a performance hit on modern
hardware.
regards, tom lane
Joseph Koshakow <koshy44@gmail.com> writes:
On Fri, Feb 18, 2022 at 11:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder if the most reasonable fix would be to start using int64
instead of int arithmetic for the values that are potentially large.
I doubt that we'd be taking much of a performance hit on modern
hardware.
That's an interesting idea. I've always assumed that the range of the
time fields of Intervals was 2147483647 hours 59 minutes
59.999999 seconds to -2147483648 hours -59 minutes
-59.999999 seconds. However the only reason that we can't support
the full range of int64 microseconds is because the struct pg_tm fields
are only ints. If we increase those fields to int64 then we'd be able to
support the full int64 range for microseconds as well as implicitly fix
some of the overflow issues in DecodeInterval and EncodeInterval.
On Sat, Feb 19, 2022 at 1:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Joseph Koshakow <koshy44@gmail.com> writes:
On Sat, Feb 19, 2022 at 12:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think that messing with struct pg_tm might have too many side-effects.
However, pg_tm isn't all that well adapted to intervals in the first
place, so it'd make sense to make a new struct specifically for interval
decoding.
Yeah that's a good idea, pg_tm is used all over the place. Is there a
reason we need an intermediate structure to convert to and from?
We could just convert directly to an Interval in DecodeInterval and
print directly from an Interval in EncodeInterval.
Yeah, I thought about that too, but if you look at the other callers of
interval2tm, you'll see this same set of issues. I'm inclined to keep
the current code structure and just fix the data structure. Although
interval2tm isn't *that* big, I don't think four copies of its logic
would be an improvement.
Also I originally created a separate thread and patch because I
thought this wouldn't be directly related to my other patch,
https://commitfest.postgresql.org/37/3541/. However I think with these
discussed changes it's directly related. So I think it's best to close
this thread and patch and copy this conversation to the other thread.
Agreed.
regards, tom lane
Attached is a patch of my first pass. The to_char method isn't finished
and I need to add a bunch of tests, but everything else is in place. It
ended up being a fairly large change in case anyone wants to take a look
the changes so far.
One thing I noticed is that interval.c has a ton of code copied and pasted
from other files. The code seemed out of date from those other files, so
I tried to bring it up to date and add my changes.
- Joe Koshakow
Attachments:
v7-0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +946 -383)
Joseph Koshakow <koshy44@gmail.com> writes:
Attached is a patch of my first pass. The to_char method isn't finished
and I need to add a bunch of tests, but everything else is in place. It
ended up being a fairly large change in case anyone wants to take a look
the changes so far.
Couple of quick comments:
* You've assumed in a number of places that long == int64. This is wrong
on many platforms. Current project style for use of int64 in printf-type
calls is to cast the argument to (long long) explicitly and use "%lld" (or
"%llu", etc). As for strtoint_64, surely that functionality exists
somewhere already (hopefully with fewer bugs than this has).
* I think that tools like Coverity will complain about how you've
got both calls that check the result of AdjustFractSeconds (etc)
and calls that don't. You might consider coding the latter like
+ /* this can't overflow: */
+ (void) AdjustFractSeconds(fval, itm, fsec, SECS_PER_DAY);
in hopes of (a) silencing those warnings and (b) making the implicit
assumption clear to readers.
* I'm not entirely buying use of pg_time_t, rather than just int64,
in struct itm. I think the point of this struct is to get away
from datetime-specific datatypes. Also, if pg_time_t ever changed
width, it'd create issues for users of this struct. I also note
a number of places in the patch that are assuming that these fields
are int64 not something else.
* I'm a little inclined to rename interval2tm/tm2interval to
interval2itm/itm2interval, as I think it'd be confusing for those
function names to refer to a typedef they no longer use.
* The uses of tm2itm make me a bit itchy. Is that sweeping
upstream-of-there overflow problems under the rug?
* // comments are not project style, either. While pgindent will
convert them to /* ... */ style, the results might not be pleasing.
One thing I noticed is that interval.c has a ton of code copied and pasted
from other files. The code seemed out of date from those other files, so
I tried to bring it up to date and add my changes.
We haven't actually maintained ecpg's copies of backend datatype-specific
code in many years. While bringing that stuff up to speed might be
worthwhile (or perhaps not, given the lack of complaints), I'd see it
as a separate project.
regards, tom lane
On Sun, Feb 20, 2022 at 6:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Couple of quick comments:
Thanks for the comments Tom, I'll work on fixing these and submit a
new patch.
* The uses of tm2itm make me a bit itchy. Is that sweeping
upstream-of-there overflow problems under the rug?
I agree, I'm not super happy with that approach. In fact
I'm pretty sure it will cause queries like
SELECT INTERVAL '2147483648:00:00';
to overflow upstream, even though queries like
SELECT INTERVAL '2147483648 hours';
would not. The places tm2itm is being used are
* After DecodeTime
* In interval_to_char.
The more general issue is how to share code with
functions that are doing almost identical things but use
pg_tm instead of the new pg_itm? I'm not really sure what
the best solution is right now but I will think about it. If
anyone has suggestions though, feel free to chime in.
- Joe Koshakow
Hi All,
Sorry for the delay in the new patch, I've attached my most recent
patch to this email. I ended up reworking a good portion of my previous
patch so below I've included some reasons why, notes on my current
approach, and some pro/cons to the approach.
* The main reason for the rework had to do with double conversions and
shared code.
* The existing code for rounding had a lot of int to double
casting and vice versa. I *think* that doubles are able to completely
represent the range of ints. However doubles are not able to represent
the full range of int64. After making the change I started noticing
a lot of lossy behavior. One thought I had was to change the doubles
to long doubles, but I wasn't able to figure out if long doubles could
completely represent the range of int64. Especially since their size
varies depending on the architecture. Does anyone know the answer to
this?
* I ended up creating two intermediate data structures for Intervals.
One for decoding and one for everything else. I'll go into more detail
below.
* One common benefit was that they both contain a usec field which
means that the Interval methods no longer need to carry around a
separate fsec argument.
* The obvious con here is that Intervals require two unique
intermediate data structures, while all other date/time types
can share a single intermediate data structure. I find this to
be a bit clunky.
* pg_itm_in is the struct used for Interval decoding. It's very similar
to pg_tm, except all of the time related fields are collapsed into a
single `int64 usec` field.
* The biggest benefit of this was that all int64-double conversions
are limited to a single function, AdjustFractMicroseconds. Instead
of fractional units flowing down over every single time field, they
only need to flow down into the single `int64 usec` field.
* Overflows are caught much earlier in the decoding process which
helps avoid wasted work.
* I found that the decoding code became simpler for time fields,
though this is a bit subjective.
* pg_itm is the struct used for all other Interval functionality. It's
very similar to pg_tm, except the tm_hour field is converted from int
to int64 and an `int tm_usec` field was added.
* When encoding and working with Intervals, we almost always want
to break the time field out into hours, min, sec, usec. So it's
helpful to have a common place to do this, instead of every
function duplicating this code.
* When breaking the time fields out, a single field will never
contain a value greater than could have fit in the next unit
higher. Meaning that minutes will never be greater than 60, seconds
will never be greater than 60, and usec will never be greater than
1,000,000. So hours is actually the only field that needs to be int64
and the rest can be an int.
* This also helps limit the impact to shared code (see below).
* There's some shared code between Intervals and other date/time types.
Specifically the DecodeTime function and the datetime_to_char_body
function. These functions take in a `struct pg_tm` and a `fsec_t fsec`
(fsec_t is just an alias for int32) which allows them to be re-used by
all date/time types. The only difference now between pg_tm and pg_itm
is the tm_hour field size (the tm_usec field in pg_itm can be used as
the fsec). So to get around this I changed the function signatures to
take a `struct pg_tm`, `fsec_t fsec`, and an `int64 hour` argument.
It's up to the caller to provide the correct hour field. Intervals can
easily convert pg_itm to a pg_tm, fsec, and hour. It's honestly a bit
error-prone since those functions have to explicitly ignore the
pg_tm->tm_hour field and use the provided hour argument instead, but I
couldn't think of a better less intrusive solution. If anyone has a
better idea, please don't hesitate to bring it up.
* This partly existed in the previous patch, but I just wanted to
restate it. All modifications to pg_itm_in during decoding are done via
helper functions that check for overflow. All invocations of these
functions either return an error on overflow or explicitly state why an
overflow is impossible.
* I completely rewrote the ParseISO8601Number function to try and avoid
double to int64 conversions. I tried to model it after the parsing done
in DecodeInterval, though I would appreciate extra scrutiny here.
- Joe Koshakow
Attachments:
v8-0001-Check-for-overflow-when-decoding-an-interval.patch (text/x-patch, +1410 -410)
I just realized another issue today. It may have been obvious from one
of Tom's earlier messages, but I'm just now putting the pieces
together.
On Fri, Feb 18, 2022 at 11:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Also, I notice that there's an overflow hazard upstream of here,
in interval2tm:
regression=# select interval '214748364 hours' * 11;
ERROR: interval out of range
regression=# \errverbose
ERROR: 22008: interval out of range
LOCATION: interval2tm, timestamp.c:1982
There's no good excuse for not being able to print a value that
we computed successfully.
Scenarios like this can properly decode the interval, but actually
error out when encoding the interval. As a consequence you can insert
the value successfully into a table, but any attempt to query the table
that includes the "bad interval" value in the result will cause an
error. Below I've demonstrated an example:
postgres=# CREATE TABLE tbl (i INTERVAL);
CREATE TABLE
postgres=# INSERT INTO tbl VALUES ('1 day'), ('3 months'), ('2 years');
INSERT 0 3
postgres=# SELECT * FROM tbl;
i
---------
1 day
3 mons
2 years
(3 rows)
postgres=# INSERT INTO tbl VALUES ('2147483647 hours 60 minutes');
INSERT 0 1
postgres=# SELECT * FROM tbl;
ERROR: interval out of range
This would seriously reduce the usability of any table that contains one
of these "bad interval" values.
My patch actually fixes this issue, but I just wanted to call it out
because it might be relevant when reviewing.
Joseph Koshakow <koshy44@gmail.com> writes:
[ v8-0001-Check-for-overflow-when-decoding-an-interval.patch ]
This isn't applying per the cfbot; looks like it got sideswiped
by 9e9858389. Here's a quick rebase. I've not reviewed it, but
I did notice (because git was in my face about this) that it's
got whitespace issues. Please try to avoid unnecessary whitespace
changes ... pgindent will clean those up, but it makes reviewing
harder.
regards, tom lane