BUG #16804: substring() function returns "negative substring length" error when using a large length argument
The following bug has been logged on the website:
Bug reference: 16804
Logged by: Rafi Shamim
Email address: rafiss@gmail.com
PostgreSQL version: 13.1
Operating system: MacOS 10.15.7
Description:
Reproduction steps:
1. Connect to any PostgreSQL database.
2. Run this query:
select substring('string' from 2 for 2147483646);
Actual result:
2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT: select substring('string'
from 2 for 2147483646)
negative substring length not allowed
Expected result:
I don't know if this is covered by these docs:
https://www.postgresql.org/docs/13/functions-string.html
But I would expect one of the following:
1. There should be no error message, and the result should be 'tring'.
(Meaning it just goes to the end of the string.)
2. An error message that says that the length argument is too long.
I don't know what the SQL standard says, but when I try similar queries with
SQLite and MSSQL, the result is 'tring'.
po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <
noreply@postgresql.org> napsal:
The following bug has been logged on the website:
Bug reference: 16804
Logged by: Rafi Shamim
Email address: rafiss@gmail.com
PostgreSQL version: 13.1
Operating system: MacOS 10.15.7
Description:Reproduction steps:
1. Connect to any PostgreSQL database.
2. Run this query:select substring('string' from 2 for 2147483646);
Actual result:
2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowedExpected result:
I don't know if this is covered by these docs:
https://www.postgresql.org/docs/13/functions-string.html
But I would expect one of the following:
1. There should be no error message, and the result should be 'tring'.
(Meaning it just goes to the end of the string.)
2. An error message that says that the length argument is too long.
I don't know what the SQL standard says, but when I try similar queries
with
SQLite and MSSQL, the result is 'tring'.
Minimally this is a bug and it should raise an error "integer out of
range". Probably in this case we can use MAX_INT as a special value of
unlimited length, although it is a little bit scary, because length is an
optional value. The attached patch should fix this issue. I do not have
access to Oracle to check the behaviour of this case there.
Regards
Pavel
Attachments:
fix-substring-intoverflow.patchtext/x-patch; charset=US-ASCII; name=fix-substring-intoverflow.patchDownload+53-43
Hi,
On Jan 4, 2021, at 11:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <noreply@postgresql.org <mailto:noreply@postgresql.org>> napsal:
select substring('string' from 2 for 2147483646);
Actual result:
2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowedMinimally this is a bug and it should raise an error "integer out of range". Probably in this case we can use MAX_INT as a special value of unlimited length, although it is a little bit scary, because length is an optional value. The attached patch should fix this issue. I do not have access to Oracle to check the behaviour of this case there.
Except according to the pg docs, it’s not out of range, it’s one less than MAX_INT.
The manual calls for it to be an integer, which is defined as:
integer 4 bytes typical choice for integer -2147483648 to +2147483647
The original bug report is one less than +2147483647, and thus should be a valid value, no?
po 4. 1. 2021 v 20:25 odesílatel Jerry Sievert <jerry@legitimatesounding.com>
napsal:
Hi,
On Jan 4, 2021, at 11:21 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <
noreply@postgresql.org> napsal:select substring('string' from 2 for 2147483646);
Actual result:
2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowedMinimally this is a bug and it should raise an error "integer out of
range". Probably in this case we can use MAX_INT as a special value of
unlimited length, although it is a little bit scary, because length is an
optional value. The attached patch should fix this issue. I do not have
access to Oracle to check the behaviour of this case there.Except according to the pg docs, it’s not out of range, it’s one less than
MAX_INT.The manual calls for it to be an integer, which is defined as:
integer 4 bytes typical choice for integer -2147483648 to +2147483647
The original bug report is one less than +2147483647, and thus should be a
valid value, no?
yes, so the implementation patch is really correct.
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
po 4. 1. 2021 v 20:25 odesilatel Jerry Sievert <jerry@legitimatesounding.com>
napsal:The original bug report is one less than +2147483647, and thus should be a
valid value, no?
yes, so the implementation patch is really correct.
I agree that this is a bug, and that what we should do in case of integer
overflow is return all the rest of the string. But this patch doesn't
really get the job done, because you haven't accounted for *negative*
overflow, ie, S+L < INT_MIN. I think the best way to fix that is to
explicitly check for L < 0 rather than trying to wait till after the
addition.
Looking around, I notice that there's an unprotected multiplication
further down. Also, having seen this, I feel very uncomfortable about
the fact that detoast_attr_slice and cohorts aren't guarding against
the same sort of integer overflow. I don't think this is the only
caller that might pass out-of-range values.
In short, I think we need something more like the attached.
regards, tom lane
Attachments:
more-overflow-checks-for-substrings-2.patchtext/x-diff; charset=us-ascii; name=more-overflow-checks-for-substrings-2.patchDownload+88-39
I forgot to include this in my original bug report, but the bytea substring
function (which has a separate implementation in varlena.c) is also
affected.
SELECT encode(substring(decode('010203', 'hex'), 2, 2147483646), 'hex');
2021-01-04 16:50:22.020 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 16:50:22.020 EST [85734] STATEMENT: SELECT
encode(substring(decode('010203', 'hex'), 2, 2147483646), 'hex')
negative substring length not allowed
On Mon, Jan 4, 2021 at 4:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Pavel Stehule <pavel.stehule@gmail.com> writes:
po 4. 1. 2021 v 20:25 odesilatel Jerry Sievert <
jerry@legitimatesounding.com>
napsal:
The original bug report is one less than +2147483647, and thus should
be a
valid value, no?
yes, so the implementation patch is really correct.
I agree that this is a bug, and that what we should do in case of integer
overflow is return all the rest of the string. But this patch doesn't
really get the job done, because you haven't accounted for *negative*
overflow, ie, S+L < INT_MIN. I think the best way to fix that is to
explicitly check for L < 0 rather than trying to wait till after the
addition.Looking around, I notice that there's an unprotected multiplication
further down. Also, having seen this, I feel very uncomfortable about
the fact that detoast_attr_slice and cohorts aren't guarding against
the same sort of integer overflow. I don't think this is the only
caller that might pass out-of-range values.In short, I think we need something more like the attached.
regards, tom lane
Rafi Shamim <rafiss@gmail.com> writes:
I forgot to include this in my original bug report, but the bytea substring
function (which has a separate implementation in varlena.c) is also
affected.
Um. And I bet the "bit" variant as well ...
regards, tom lane