BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

Started by Joshua Yanovski, over 12 years ago, 5 messages, bugs

pythonesque@gmail.com

over 12 years ago

The following bug has been logged on the website:

Bug reference: 9204
Logged by: Joshua Yanovski
Email address: pythonesque@gmail.com
PostgreSQL version: 9.3.2
Operating system: Ubuntu 12.04
Description:

As in description. This follows from how these are scanned in scan.l:

    ident = litbuf_udeescape('\\', yyscanner);
    if (yyextra->literallen >= NAMEDATALEN)
        truncate_identifier(ident, yyextra->literallen, true);

Because literallen is the length of the original string, this does
unnecessary work (and reports a misleading notice) if the resulting string
is shorter.

psql -v 'VERBOSITY=verbose' -c "select
U&\"abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd\\3737\"
FROM dummy"
NOTICE: 42622: identifier
"abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd㜷" will be
truncated to
"abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd㜷"
LOCATION: truncate_identifier, scansup.c:195

It is a pretty borderline edge case and doesn't have any serious
consequences, but it does seem like it should be easy to fix without a huge
hit to efficiency, considering that the length can be calculated in constant
time from known information in litbuf_udeescape.
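
The mismatch described above can be reproduced outside the scanner. The following is only a sketch under stated assumptions: demo_udeescape is a hypothetical stand-in for litbuf_udeescape (it handles only four-digit \XXXX escapes and assumes well-formed input), NAMEDATALEN is assumed to be the default 64, and the two needs_truncation_* helpers are invented for illustration. It is not the code in scan.l, nor the fix that was committed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAMEDATALEN 64          /* assumed: PostgreSQL's default value */

/*
 * Hypothetical stand-in for litbuf_udeescape: expands \XXXX escapes
 * (four hex digits, code points below U+10000, no surrogate handling)
 * into UTF-8.  Returns the number of bytes written, excluding the NUL.
 */
static size_t
demo_udeescape(const char *src, char *dst, size_t dstsize)
{
    size_t      n = 0;

    /* A 5-byte escape becomes at most 3 bytes, so output never grows. */
    assert(dstsize > strlen(src));

    while (*src)
    {
        if (src[0] == '\\' && strlen(src) >= 5)
        {
            char        hex[5];
            unsigned long cp;

            memcpy(hex, src + 1, 4);
            hex[4] = '\0';
            cp = strtoul(hex, NULL, 16);

            if (cp < 0x80)
                dst[n++] = (char) cp;
            else if (cp < 0x800)
            {
                dst[n++] = (char) (0xC0 | (cp >> 6));
                dst[n++] = (char) (0x80 | (cp & 0x3F));
            }
            else
            {
                dst[n++] = (char) (0xE0 | (cp >> 12));
                dst[n++] = (char) (0x80 | ((cp >> 6) & 0x3F));
                dst[n++] = (char) (0x80 | (cp & 0x3F));
            }
            src += 5;
        }
        else
            dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}

/* The reported behaviour: decide based on the raw, still-escaped length. */
static int
needs_truncation_raw(const char *raw)
{
    return strlen(raw) >= NAMEDATALEN;
}

/* The suggested behaviour: decide based on the de-escaped length. */
static int
needs_truncation_deescaped(const char *raw)
{
    char        buf[256];
    size_t      len = demo_udeescape(raw, buf, sizeof(buf));

    return len >= NAMEDATALEN;
}
```

For the identifier in the report, the raw text is 65 bytes (60 letters plus the 5-byte escape \3737), while the de-escaped text is 63 bytes (60 letters plus a 3-byte UTF-8 character), so only the raw-length check reaches the truncation path.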

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Joshua Yanovski (#1)

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

pythonesque@gmail.com writes:

> As in description. This follows from how these are scanned in scan.l:
>
>     ident = litbuf_udeescape('\\', yyscanner);
>     if (yyextra->literallen >= NAMEDATALEN)
>         truncate_identifier(ident, yyextra->literallen, true);

Yeah, that's a bug --- yyextra->literallen is not the thing to use here.
It's just luck that truncate_identifier doesn't fail entirely, since
we're violating its API contract. Will fix, thanks for reporting it.

regards, tom lane

Joshua Yanovski

pythonesque@gmail.com

over 12 years ago

In reply to: Tom Lane (#2)

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

There is one other thing I noticed in that area of the code--namely, if
NAMEDATALEN is low enough, an identifier can be truncated down to an empty
identifier, since the check for empty identifier length is done before the
call to truncate_identifier. But I doubt this will ever be a problem in
practice and there may be other compensatory checks elsewhere.

On Thu, Feb 13, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> pythonesque@gmail.com writes:
>
>> As in description. This follows from how these are scanned in scan.l:
>>
>>     ident = litbuf_udeescape('\\', yyscanner);
>>     if (yyextra->literallen >= NAMEDATALEN)
>>         truncate_identifier(ident, yyextra->literallen, true);
>
> Yeah, that's a bug --- yyextra->literallen is not the thing to use here.
> It's just luck that truncate_identifier doesn't fail entirely, since
> we're violating its API contract. Will fix, thanks for reporting it.
>
> regards, tom lane

--
Josh

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Joshua Yanovski (#3)

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

Joshua Yanovski <pythonesque@gmail.com> writes:

> There is one other thing I noticed in that area of the code--namely, if
> NAMEDATALEN is low enough, an identifier can be truncated down to an empty
> identifier, since the check for empty identifier length is done before the
> call to truncate_identifier. But I doubt this will ever be a problem in
> practice and there may be other compensatory checks elsewhere.

That'd only be possible if NAMEDATALEN were smaller than the longest
possible multibyte character, which I think is not a case we need to
concern ourselves with. We currently don't support multibytes longer
than 4 bytes, and even if we do full Unicode somewhere down the line,
it'd still only be 6 bytes. I can't imagine anyone wanting to run
with NAMEDATALEN less than 16 or so --- even if they tried, it'd likely
not work because of conflicts in the names of built-in functions.
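
To make the boundary condition above concrete, here is a minimal sketch of clipping a UTF-8 string at a character boundary. It assumes that truncation never splits a multibyte character and that the input is valid UTF-8; demo_utf8_charlen and demo_clip_len are hypothetical names for illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of a UTF-8 character, judged from its lead byte. */
static size_t
demo_utf8_charlen(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Largest prefix length, in bytes, that is <= limit and ends on a
 * character boundary.  Returns 0 when even the first character is
 * longer than limit, which is the "empty identifier" case.
 */
static size_t
demo_clip_len(const char *s, size_t len, size_t limit)
{
    size_t      clipped = 0;

    assert(s != NULL);

    while (clipped < len)
    {
        size_t      clen = demo_utf8_charlen((unsigned char) s[clipped]);

        if (clipped + clen > limit || clipped + clen > len)
            break;
        clipped += clen;
    }
    return clipped;
}
```

With an identifier that starts with a 3-byte character, a limit of 2 bytes clips to length 0, whereas any limit of 3 or more keeps at least that first character. This is why the empty result needs a limit smaller than the longest multibyte character.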

regards, tom lane

Joshua Yanovski

pythonesque@gmail.com

over 12 years ago

In reply to: Tom Lane (#4)

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

Yeah, I agree that it will never be a problem in a real database--just
thought I'd bring it up since it was something I noticed and I couldn't
find any explicit minimum value for it :) Thanks for fixing this!

--
Josh