BUG #16804: substring() function returns "negative substring length" error when using a large length argument

Started by PG Bug reporting formover 5 years ago7 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16804
Logged by: Rafi Shamim
Email address: rafiss@gmail.com
PostgreSQL version: 13.1
Operating system: MacOS 10.15.7
Description:

Reproduction steps:

1. Connect to any PostgreSQL database.
2. Run this query:

select substring('string' from 2 for 2147483646);

Actual result:

2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT: select substring('string'
from 2 for 2147483646)
negative substring length not allowed

Expected result:

I don't know if this is covered by these docs:
https://www.postgresql.org/docs/13/functions-string.html
But I would expect one of the following:
1. There should be no error message, and the result should be 'tring'.
(Meaning it just goes to the end of the string.)
2. An error message that says that the length argument is too long.

I don't know what the SQL standard says, but when I try similar queries with
SQLite and MSSQL, the result is 'tring'.

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <
noreply@postgresql.org> napsal:

The following bug has been logged on the website:

Bug reference: 16804
Logged by: Rafi Shamim
Email address: rafiss@gmail.com
PostgreSQL version: 13.1
Operating system: MacOS 10.15.7
Description:

Reproduction steps:

1. Connect to any PostgreSQL database.
2. Run this query:

select substring('string' from 2 for 2147483646);

Actual result:

2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowed

Expected result:

I don't know if this is covered by these docs:
https://www.postgresql.org/docs/13/functions-string.html
But I would expect one of the following:
1. There should be no error message, and the result should be 'tring'.
(Meaning it just goes to the end of the string.)
2. An error message that says that the length argument is too long.

I don't know what the SQL standard says, but when I try similar queries
with
SQLite and MSSQL, the result is 'tring'.

Minimally this is a bug and it should raise an error "integer out of
range". Probably in this case we can use MAX_INT as a special value of
unlimited length, although it is a little bit scary, because length is an
optional value. The attached patch should fix this issue. I do not have
access to Oracle to check the behaviour of this case there.

Regards

Pavel

Attachments:

fix-substring-intoverflow.patchtext/x-patch; charset=US-ASCII; name=fix-substring-intoverflow.patchDownload+53-43
#3Jerry Sievert
jerry@legitimatesounding.com
In reply to: Pavel Stehule (#2)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

Hi,

On Jan 4, 2021, at 11:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <noreply@postgresql.org <mailto:noreply@postgresql.org>> napsal:

select substring('string' from 2 for 2147483646);

Actual result:

2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowed

Minimally this is a bug and it should raise an error "integer out of range". Probably in this case we can use MAX_INT as a special value of unlimited length, although it is a little bit scary, because length is an optional value. The attached patch should fix this issue. I do not have access to Oracle to check the behaviour of this case there.

Except according to the pg docs, it’s not out of range, it’s one less than MAX_INT.

The manual calls for it to be an integer, which is defined as:

integer 4 bytes typical choice for integer -2147483648 to +2147483647

The original bug report is one less than +2147483647, and thus should be a valid value, no?

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jerry Sievert (#3)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

po 4. 1. 2021 v 20:25 odesílatel Jerry Sievert <jerry@legitimatesounding.com>
napsal:

Hi,

On Jan 4, 2021, at 11:21 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

po 4. 1. 2021 v 19:44 odesílatel PG Bug reporting form <
noreply@postgresql.org> napsal:

select substring('string' from 2 for 2147483646);

Actual result:

2021-01-04 12:43:13.145 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 12:43:13.145 EST [85734] STATEMENT:s
negative substring length not allowed

Minimally this is a bug and it should raise an error "integer out of

range". Probably in this case we can use MAX_INT as a special value of
unlimited length, although it is a little bit scary, because length is an
optional value. The attached patch should fix this issue. I do not have
access to Oracle to check the behaviour of this case there.

Except according to the pg docs, it’s not out of range, it’s one less than
MAX_INT.

The manual calls for it to be an integer, which is defined as:

integer 4 bytes typical choice for integer -2147483648 to +2147483647

The original bug report is one less than +2147483647, and thus should be a
valid value, no?

yes, so the implementation patch is really correct.

Regards

Pavel

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#4)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

Pavel Stehule <pavel.stehule@gmail.com> writes:

po 4. 1. 2021 v 20:25 odesilatel Jerry Sievert <jerry@legitimatesounding.com>
napsal:

The original bug report is one less than +2147483647, and thus should be a
valid value, no?

yes, so the implementation patch is really correct.

I agree that this is a bug, and that what we should do in case of integer
overflow is return all the rest of the string. But this patch doesn't
really get the job done, because you haven't accounted for *negative*
overflow, ie, S+L < INT_MIN. I think the best way to fix that is to
explicitly check for L < 0 rather than trying to wait till after the
addition.

Looking around, I notice that there's an unprotected multiplication
further down. Also, having seen this, I feel very uncomfortable about
the fact that detoast_attr_slice and cohorts aren't guarding against
the same sort of integer overflow. I don't think this is the only
caller that might pass out-of-range values.

In short, I think we need something more like the attached.

regards, tom lane

Attachments:

more-overflow-checks-for-substrings-2.patchtext/x-diff; charset=us-ascii; name=more-overflow-checks-for-substrings-2.patchDownload+88-39
#6Rafi Shamim
rafiss@gmail.com
In reply to: Tom Lane (#5)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

I forgot to include this in my original bug report, but the bytea substring
function (which has a separate implementation in varlena.c) is also
affected.

SELECT encode(substring(decode('010203', 'hex'), 2, 2147483646), 'hex');

2021-01-04 16:50:22.020 EST [85734] ERROR: negative substring length not
allowed
2021-01-04 16:50:22.020 EST [85734] STATEMENT: SELECT
encode(substring(decode('010203', 'hex'), 2, 2147483646), 'hex')
negative substring length not allowed

On Mon, Jan 4, 2021 at 4:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Pavel Stehule <pavel.stehule@gmail.com> writes:

po 4. 1. 2021 v 20:25 odesilatel Jerry Sievert <

jerry@legitimatesounding.com>

napsal:

The original bug report is one less than +2147483647, and thus should

be a

valid value, no?

yes, so the implementation patch is really correct.

I agree that this is a bug, and that what we should do in case of integer
overflow is return all the rest of the string. But this patch doesn't
really get the job done, because you haven't accounted for *negative*
overflow, ie, S+L < INT_MIN. I think the best way to fix that is to
explicitly check for L < 0 rather than trying to wait till after the
addition.

Looking around, I notice that there's an unprotected multiplication
further down. Also, having seen this, I feel very uncomfortable about
the fact that detoast_attr_slice and cohorts aren't guarding against
the same sort of integer overflow. I don't think this is the only
caller that might pass out-of-range values.

In short, I think we need something more like the attached.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rafi Shamim (#6)
Re: BUG #16804: substring() function returns "negative substring length" error when using a large length argument

Rafi Shamim <rafiss@gmail.com> writes:

I forgot to include this in my original bug report, but the bytea substring
function (which has a separate implementation in varlena.c) is also
affected.

Um. And I bet the "bit" variant as well ...

regards, tom lane