BUG #15474: Special character escape sequences need better documentation, or more easily found documentation
The following bug has been logged on the website:
Bug reference: 15474
Logged by: Micheal Taylor
Email address: bubthegreat@gmail.com
PostgreSQL version: 11.0
Operating system: Any
Description:
When looking for postgresql documentation on characters that need to be
escaped, and how to escape them, I consistently get to this page:
https://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-SPECIAL-CHARS
That page goes over high levels of escaping numerous things, but in the
special characters and operators, it doesn't clearly show how to escape the
operators or special characters within a more complicated query structure.
For example, if I have the following table:
CREATE TABLE IF NOT EXISTS {tablename}
(
time TIMESTAMP,
case_number VARCHAR(25),
jira VARCHAR(25),
status VARCHAR(25),
fqdn VARCHAR(255),
subject TEXT,
description TEXT,
comment TEXT
)
And the following insertion information:
INSERT INTO orphans(time, case_number, fqdn, status, subject,
description)
SELECT '{timestamp}', '{case_number}', '{fqdn}', 'new', '{subject}',
'{description}'
WHERE
NOT EXISTS (
SELECT case_number, status FROM orphans
WHERE case_number = '{case_number}'
)
Where all columns inputs are strings, if any of those inputs have special
characters like % or ', it is not clear by quick inspection of the
documentation how to escape those characters. A simple table elaborating on
the escapes for each special character would be incredibly helpful at
determining how to translate those escapes for cleaning strings prior to
insertion so those of us using postgresql can quickly write cleaning
functions for data.
On Wednesday, October 31, 2018, PG Bug reporting form <
noreply@postgresql.org> wrote:
Where all columns inputs are strings, if any of those inputs have special
characters like % or '
The only character needing escaping in a normal literal is the single quote
since it identifies the end of the literal otherwise. For an escaped
string the escaping character (backslash) also needs to be protected.
These are the only two and are documented in the sections covering those
types. A table containing one or maybe two rows doesn’t seem like an
improvement.
In both cases doubling up the special character protects its literal
meaning.
Specifically, % is not a special character for literals - though it can be
for a particular operator or function. Said operator will then document
how it is used.
David J.
On Wednesday, October 31, 2018, PG Bug reporting form <
noreply@postgresql.org> wrote:
A simple table elaborating on
the escapes for each special character would be incredibly helpful at
determining how to translate those escapes for cleaning strings prior to
insertion so those of us using postgresql can quickly write cleaning
functions for data.
Translating escapes and writing cleaning functions should be a rare need
unless you are writing driver-level code. Applications should avoid caring
about either by avoiding dynamic sql altogether or using quote_* functions
or the format function.
David J.
"PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> A simple table elaborating on the escapes for each special
PG> character would be incredibly helpful at determining how to
PG> translate those escapes for cleaning strings prior to insertion so
PG> those of us using postgresql can quickly write cleaning functions
PG> for data.
If you're "writing cleaning functions" you're already making a serious
mistake, because you should be passing data values as parameters (which
do not require escapes) rather than interpolating into the query string.
If you actually do need to interpolate into the query string for some
reason (like doing COPY or other utility statement that doesn't support
parameters), then you should be using the quote/escape functions
provided by the driver for your client language (e.g. in libpq there is
PQescapeStringConn).
--
Andrew (irc:RhodiumToad)
tl;dr: If this is just an RTFM moment, then I'm happy to chalk it up to
that and move on with my life. :)
I'm coming from a background of devops (mostly python/C++), not database
administration, so this may be something that's fairly straightforward for
a different audience - but this wasn't clear to me as someone who's coming
into it from the outside. An obscure stackoverflow answer is where I
landed how to fix it rather than being able to find it in the
documentation, so I figured I'd submit it as an improvement request. If
I'm super off base on it not being clear, then feel free to close - but I
felt like the point of documentation is that it should answer those kinds
of questions rather than googling other places. Even just a link to the
documentation for those would be very helpful as a "Looking for this? It's
<here>" for the special character documentation that David mentioned would
have been extremely helpful IMHO.
On Wed, Oct 31, 2018 at 10:51 AM Andrew Gierth <andrew@tao11.riddles.org.uk>
wrote:
"PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> A simple table elaborating on the escapes for each special
PG> character would be incredibly helpful at determining how to
PG> translate those escapes for cleaning strings prior to insertion so
PG> those of us using postgresql can quickly write cleaning functions
PG> for data.If you're "writing cleaning functions" you're already making a serious
mistake, because you should be passing data values as parameters (which
do not require escapes) rather than interpolating into the query string.If you actually do need to interpolate into the query string for some
reason (like doing COPY or other utility statement that doesn't support
parameters), then you should be using the quote/escape functions
provided by the driver for your client language (e.g. in libpq there is
PQescapeStringConn).--
Andrew (irc:RhodiumToad)
--
Mike Taylor
bubthegreat@gmail.com
(801)913-9767
btw - You guys are super responsive. Very nice to experience
On Wed, Oct 31, 2018 at 11:23 AM Mike Taylor <bubthegreat@gmail.com> wrote:
tl;dr: If this is just an RTFM moment, then I'm happy to chalk it up to
that and move on with my life. :)I'm coming from a background of devops (mostly python/C++), not database
administration, so this may be something that's fairly straightforward for
a different audience - but this wasn't clear to me as someone who's coming
into it from the outside. An obscure stackoverflow answer is where I
landed how to fix it rather than being able to find it in the
documentation, so I figured I'd submit it as an improvement request. If
I'm super off base on it not being clear, then feel free to close - but I
felt like the point of documentation is that it should answer those kinds
of questions rather than googling other places. Even just a link to the
documentation for those would be very helpful as a "Looking for this? It's
<here>" for the special character documentation that David mentioned would
have been extremely helpful IMHO.On Wed, Oct 31, 2018 at 10:51 AM Andrew Gierth <
andrew@tao11.riddles.org.uk> wrote:"PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> A simple table elaborating on the escapes for each special
PG> character would be incredibly helpful at determining how to
PG> translate those escapes for cleaning strings prior to insertion so
PG> those of us using postgresql can quickly write cleaning functions
PG> for data.If you're "writing cleaning functions" you're already making a serious
mistake, because you should be passing data values as parameters (which
do not require escapes) rather than interpolating into the query string.If you actually do need to interpolate into the query string for some
reason (like doing COPY or other utility statement that doesn't support
parameters), then you should be using the quote/escape functions
provided by the driver for your client language (e.g. in libpq there is
PQescapeStringConn).--
Andrew (irc:RhodiumToad)--
Mike Taylor
bubthegreat@gmail.com
(801)913-9767
--
Mike Taylor
bubthegreat@gmail.com
(801)913-9767
On Wed, Oct 31, 2018 at 10:34 AM Mike Taylor <bubthegreat@gmail.com> wrote:
Even just a link to the documentation for those would be very helpful as a
"Looking for this? It's <here>" for the special character documentation
that David mentioned would have been extremely helpful IMHO.
Not sure where you are expecting this "link to the documentation" to
exist...
"SQL Syntax" is a top-level item in the Table of Contents of the
documentation; Under that "Constants" seems like the right thing to call
these hand-written literals so that seems adequate as well. Given your
background in programming languages I would expect those two headings to be
sufficient to point you to the correct location in the documentation to
learn how to write SQL literals. Everyone has unique experiences though
and that's why the lists exist. The -general list would be a more
appropriate place to make sure inquiries though as this isn't a bug.
David J.
On 31/10/2018 18:51, Andrew Gierth wrote:
"PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> A simple table elaborating on the escapes for each special
PG> character would be incredibly helpful at determining how to
PG> translate those escapes for cleaning strings prior to insertion so
PG> those of us using postgresql can quickly write cleaning functions
PG> for data.If you're "writing cleaning functions" you're already making a serious
mistake, because you should be passing data values as parameters (which
do not require escapes) rather than interpolating into the query string.If you actually do need to interpolate into the query string for some
reason (like doing COPY or other utility statement that doesn't support
parameters), then you should be using the quote/escape functions
provided by the driver for your client language (e.g. in libpq there is
PQescapeStringConn).
Agreed. As pointed out in this report, though, the documentation doesn't
say that. This section is part of the "Lexical structure" chapter, so
it's perhaps more aimed at people writing drivers or SQL code
generators, than general application authors. But when someone like the
OP lands on that page, how can he tell?
It might be a good idea to add a note somewhere in there along the lines of:
"NOTE: All popular client libraries have functions for correctly quoting
and escaping user input, for use in string literals or SQL identifiers.
Most applications should use those, or use out-of-band query parameters,
instead of trying to follow the rules explained here directly. Please
refer to the documentation of your programming language or driver on how
to do that. The libpq quoting/escaping functions are explained in
https://www.postgresql.org/docs/current/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING".
- Heikki