BUG #4027: backslash escaping not disabled in plpgsql
The following bug has been logged online:
Bug reference: 4027
Logged by: Jonathan Guthrie
Email address: jguthrie@brokersys.com
PostgreSQL version: 8.3.0
Operating system: Debian Gnu/Linux "unstable" 2.6.24
Description: backslash escaping not disabled in plpgsql
Details:
I have set the standard_conforming_strings to "on" in my settings, and have
verified it by executing a
select '\';
which works fine and produces the expected:
?column?
----------
\
(1 row)
However, when I attempt to define this function:
create function foo (out r refcursor) as $bar$
begin
open r for
select * from user_data
where name_first like name escape '\';
end; $bar$ language plpgsql;
it complains about an unterminated string. ("ERROR: unterminated string")
However, if I double the backslashes, it compiles just fine, and does not
emit a warning even though escape_string_warning is also set to 'on'. As
expected, the system does emit a warning when I double the backslashes and
when standard_conforming_strings is set to 'off'. I also have
backslash_quote set to 'off', but it doesn't seem to change anything in this
case.
I believe that this is incorrect behavior and that the backslash should be
just a character in that string when standard_conforming_strings is set to
'on'.
"Jonathan Guthrie" <jguthrie@brokersys.com> writes:
I have set the standard_conforming_strings to "on" in my settings ...
However, when I attempt to define this function:
create function foo (out r refcursor) as $bar$
begin
open r for
select * from user_data
where name_first like name escape '\';
end; $bar$ language plpgsql;
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it. I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.
regards, tom lane
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.
I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.
I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.
Is this actually true or did we just forget it? :-)
Peter Eisentraut <peter_e@gmx.net> writes:
Tom Lane wrote:
I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.
Is this actually true or did we just forget it? :-)
I recall thinking about the point. The decision could've been wrong ...
regards, tom lane
Peter Eisentraut wrote:
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.
Agreed. standard_conforming_strings should affect _all_ strings.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
Agreed. standard_conforming_strings should affect _all_ strings.
We might need another transition period over a few releases with a
separate "plpgsql_standard_conforming_strings" parameter. Just changing it
immediately is perhaps a bit risky.
Peter Eisentraut wrote:
Bruce Momjian wrote:
Agreed. standard_conforming_strings should affect _all_ strings.
We might need another transition period over a few releases with a
separate "plpgsql_standard_conforming_strings" parameter. Just changing it
immediately is perhaps a bit risky.
We haven't even enabled standard_conforming_strings by default yet. It
was added as changeable in 8.2. Is this never going to be enabled by
default?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Peter Eisentraut wrote:
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.Is this actually true or did we just forget it? :-)
I would like to add a TODO item for this, but I am concerned that people
running functions with different standard_conforming_strings values
would have function syntax errors on mismatch. Is that acceptable?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Peter Eisentraut wrote:
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.Is this actually true or did we just forget it? :-)
Did we ever address this?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Peter Eisentraut wrote:
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.Is this actually true or did we just forget it? :-)
I have added this TODO item:
Consider honoring standard_conforming_strings in PL/pgSQL function
bodies
* http://archives.postgresql.org/pgsql-bugs/2008-03/msg00102.php
Are we every going to enable standard_conforming_strings by default? If
not, I will remove the TODO item mentiong this.
standard_conforming_strings was added in Postgres 8.1, and
escape_string_warning was enabled in 8.2.
I think the big issue is that having standard_conforming_strings affect
function behavior introduces the same problems we have had in the past
of having a GUC affect function behavior.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Thu, Apr 9, 2009 at 11:16 AM, Bruce Momjian <bruce@momjian.us> wrote:
Peter Eisentraut wrote:
Tom Lane wrote:
plpgsql does not consider standard_conforming_strings --- it still uses
backslash escaping in its function bodies regardless. Since the
language itself is not standardized, I see no particular reason that
standard_conforming_strings should govern it.I think plpgsql should behave either consistently with the rest of PostgreSQL
or with Oracle, which it is copied from.I believe the reason for
not changing it was that it seemed too likely to break existing
functions, with potentially nasty consequences if they chanced to be
security definers.Is this actually true or did we just forget it? :-)
I have added this TODO item:
Consider honoring standard_conforming_strings in PL/pgSQL function
bodies* http://archives.postgresql.org/pgsql-bugs/2008-03/msg00102.php
Are we every going to enable standard_conforming_strings by default? If
not, I will remove the TODO item mentiong this.
standard_conforming_strings was added in Postgres 8.1, and
escape_string_warning was enabled in 8.2.I think the big issue is that having standard_conforming_strings affect
function behavior introduces the same problems we have had in the past
of having a GUC affect function behavior.
I think this should wait at least one more release. Based on my
experience, there are probably a LOT of applications out there that
have yet to be updated.
It wouldn't bother me if we never enabled it by default, either. I'm
just -1 on doing it now.
...Robert
Bruce Momjian <bruce@momjian.us> wrote:
standard_conforming_strings was added in Postgres 8.1, and
escape_string_warning was enabled in 8.2.
Other way around -- the warning was available in 8.1; the standard
character string literals were available in 8.2.
I think the big issue is that having standard_conforming_strings
affect function behavior introduces the same problems we have had in
the past of having a GUC affect function behavior.
Can't that be managed with this CREATE FUNCTION option?:
SET configuration_parameter { TO value | = value | FROM CURRENT }
I would like to see standard character string literals at least
available in PL/pgSQL, although I don't personally care whether it is
the default or whether I need to specify it with the above option.
Might it not confuse people to have this GUC behave differently than
others, though?
-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Bruce Momjian <bruce@momjian.us> wrote:
I think the big issue is that having standard_conforming_strings
affect function behavior introduces the same problems we have had in
the past of having a GUC affect function behavior.
Can't that be managed with this CREATE FUNCTION option?:
SET configuration_parameter { TO value | = value | FROM CURRENT }
It can be, the question is whether we're prepared to break everything
under the sun until people add that.
regards, tom lane
Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Bruce Momjian <bruce@momjian.us> wrote:
I think the big issue is that having standard_conforming_strings
affect function behavior introduces the same problems we have had in
the past of having a GUC affect function behavior.Can't that be managed with this CREATE FUNCTION option?:
SET configuration_parameter { TO value | = value | FROM CURRENT }It can be, the question is whether we're prepared to break everything
under the sun until people add that.
I think we would first have to agree to issue escape_string_warning
warnings for code in PL/pgSQL functions, then think about having
standard_conforming_strings control PL/pgSQL behavior; this is what we
did with SQL and it seems to have worked.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Can't that be managed with this CREATE FUNCTION option?:
SET configuration_parameter { TO value | = value | FROM CURRENT }It can be, the question is whether we're prepared to break
everything
under the sun until people add that.
Well, surely the 8.3 behavior is not what we want.
(1) The plpgsql parser identifies the boundaries of the string based
on backslash escapes.
(2) The character string literal is interpreted based on the GUC
setting the first time the function is executed (and presumably the
first time executed after the function's invalidated).
(3) Subsequent changes to the GUC don't affect how it's interpreted.
scca=# show standard_conforming_strings ;
standard_conforming_strings
-----------------------------
on
(1 row)
scca=# create or replace function kjgtest() returns text language
plpgsql immutable strict as $$ begin return '\x49'; end; $$;
CREATE FUNCTION
scca=# select * from kjgtest();
kjgtest
---------
\x49
(1 row)
scca=# set standard_conforming_strings = off;
SET
scca=# select * from kjgtest();
kjgtest
---------
\x49
(1 row)
scca=# create or replace function kjgtest() returns text language
plpgsql immutable strict as $$ begin return '\x49'; end; $$;
CREATE FUNCTION
scca=# select * from kjgtest();
kjgtest
---------
I
(1 row)
scca=# set standard_conforming_strings = on;
SET
scca=# select * from kjgtest();
kjgtest
---------
I
(1 row)
scca=# create or replace function kjgtest() returns text language
plpgsql immutable strict as $$ begin return '\x49'; end; $$;
CREATE FUNCTION
scca=# set standard_conforming_strings = off;
SET
scca=# select * from kjgtest();
kjgtest
---------
I
(1 row)
-Kevin
Bruce Momjian <bruce@momjian.us> writes:
Tom Lane wrote:
It can be, the question is whether we're prepared to break everything
under the sun until people add that.
I think we would first have to agree to issue escape_string_warning
warnings for code in PL/pgSQL functions, then think about having
standard_conforming_strings control PL/pgSQL behavior; this is what we
did with SQL and it seems to have worked.
Well, considering that we are still afraid to pull the trigger on
changing the standard_conforming_strings default, it's a bit premature
to claim that it "worked" for SQL. But I agree that some kind of
stepwise process will be necessary if we want to try to change this.
IIRC there was some discussion of using plpgsql's (undocumented) #option
syntax to control this, rather than having a GUC that would be specific
to plpgsql.
regards, tom lane
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
Well, surely the 8.3 behavior is not what we want.
Unless I'm missing something, plpgsql *already* effectively recognizes
and respects the standard_conforming_strings GUC *except* as the last
character of a conforming string literal within the procedure body,
and then not always. Am I missing something here?
scca=# set standard_conforming_strings = on;
SET
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as 'begin return \'\x49\'; end;';
Expanded display is on.
Invalid command \';. Try \? for help.
scca=# \x
Expanded display is off.
scca-# create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return '\x49\'; end; $$;
ERROR: syntax error at or near "create"
LINE 2: create or replace function kjgtest() returns text language
p...
^
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return '\x49\\'; end; $$;
CREATE FUNCTION
scca=# select kjgtest();
kjgtest
---------
\x49\\
(1 row)
scca=# set standard_conforming_strings = off;
SET
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as 'begin return \'\x49\'; end;';
CREATE FUNCTION
scca=# select kjgtest();
kjgtest
---------
I
(1 row)
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return '\x49\'; end; $$;
ERROR: unterminated string
CONTEXT: compile of PL/pgSQL function "kjgtest" near line 1
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return '\x49\\'; end; $$;
CREATE FUNCTION
scca=# select kjgtest();
kjgtest
---------
I\
(1 row)
Given this behavior, how much could be working for
standard_conforming_strings = on which would break with more complete
support? Maybe this particular GUC should default to an implied SET
standard_conforming_strings FROM CURRENT and the plpgsql parser should
use it? Can anyone show a working case that would break with that?
-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Unless I'm missing something, plpgsql *already* effectively recognizes
and respects the standard_conforming_strings GUC *except* as the last
character of a conforming string literal within the procedure body,
and then not always. Am I missing something here?
Yes --- I think you are confusing parsing of the string literal that
is the argument of CREATE FUNCTION with the parsing that the plpgsql
interpreter does on the function body once it gets it. In particular,
this example:
create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return 'foo\'; end; $$;
fails regardless of the standard_conforming_strings setting, because
the plpgsql interpreter considers the backslash to escape the quote
regardless.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think you are confusing parsing of the string literal that
is the argument of CREATE FUNCTION with the parsing that the plpgsql
interpreter does on the function body once it gets it. In
particular, this example:create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return 'foo\'; end; $$;fails regardless of the standard_conforming_strings setting, because
the plpgsql interpreter considers the backslash to escape the quote
regardless.
Oh, I'm not confused about that at all. I'm arguing that it's a bad
idea. I agree with the OP that this is a bug. Did you look at my
other examples of behavior? In particular:
scca=# create or replace function kjgtest() returns text language
plpgsql immutable as $$ begin return '\x49\\'; end; $$;
CREATE FUNCTION
scca=# select kjgtest();
kjgtest
---------
\x49\\
(1 row)
Can you show one case where having plgpsql parse the function body
based on the standard_conforming_strings GUC would break *anything*
that now works? That's an allegation which I haven't been able to
confirm, so I'm wondering about the basis.
-Kevin
I wrote:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think you are confusing parsing of the string literal that
is the argument of CREATE FUNCTION with the parsing that the
plpgsql
interpreter does on the function body once it gets it.
Oh, I'm not confused about that at all. I'm arguing that it's a bad
idea.
To be more explicit, I see that there is a third parser phase -- when
the function is planned, the original contents of the character string
literal are passed to the normal PostgreSQL execution engine, which
parses them again, potentially using different rules from those used
by the plpgsql interpreter. I maintain that having the execution
engine use different rules for looking at the value of the literal
than the plpgsql parser used to find the boundaries of the literal
is where the weird corner case bugs come in.
For someone using string literal '\x49\\' in a plpgsql function, the
plpgsql parser sees it as a two character string, but when the
function is actually run, depending on whether the first execution is
using standard string literals, this can be either a two character or
a six character string. Unless the coder of the function uses the SET
option in declaring the function, they don't know what value will be
used at run time, and it may change from run to run.
It seems to me that we already have exactly the kinds of problems you
say you want to avoid, and that there is an obvious fix to avoid them.
-Kevin