Backslashes in string literals

Started by Kevin Grittner about 20 years ago, 13 messages
#1Kevin Grittner
Kevin.Grittner@wicourts.gov
1 attachment(s)

I've just been bitten by the "backslash in string literals" issue. I
have reviewed the mailing lists and the TODO list. I see that the
direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch. It's probably somewhat naive, but
maybe it can be helpful.

-Kevin
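
The difference between the two literal forms can be sketched in C roughly as follows. This is only an illustration of the behavior described above, not code from the attached patch: `decode_literal_body` and `unescape_char` are hypothetical names, and octal and hex escapes are left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Map the character following a backslash to the byte it stands for. */
static char
unescape_char(char c)
{
    switch (c)
    {
        case 'b': return '\b';
        case 'f': return '\f';
        case 'n': return '\n';
        case 'r': return '\r';
        case 't': return '\t';
        default:  return c;     /* e.g. \\ gives \ and \' gives ' */
    }
}

/*
 * Decode the text between the quotes of a string literal into out.
 * standard = true:  backslashes are ordinary characters ('...' per SQL).
 * standard = false: backslash starts an escape (E'...' style).
 * Returns the number of bytes written, not counting the trailing NUL.
 */
static size_t
decode_literal_body(const char *body, bool standard, char *out, size_t outsz)
{
    size_t n = 0;

    assert(body != NULL && out != NULL && outsz > 0);

    while (*body != '\0' && n + 1 < outsz)
    {
        if (body[0] == '\'' && body[1] == '\'')
        {
            out[n++] = '\'';            /* '' is a quote in both modes */
            body += 2;
        }
        else if (!standard && body[0] == '\\' && body[1] != '\0')
        {
            out[n++] = unescape_char(body[1]);
            body += 2;
        }
        else
            out[n++] = *body++;         /* copied through unchanged */
    }
    out[n] = '\0';
    return n;
}
```

Given the body `C:\temp`, the standard mode returns the same seven bytes, while the escape mode turns `\t` into a tab and returns six. That silent change is the kind of surprise that prompts this thread.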

Attachments:

patch.txt (application/octet-stream)
Index: configure
===================================================================
RCS file: /projects/cvsroot/pgsql/configure,v
retrieving revision 1.466
diff -c -r1.466 configure
*** configure	6 Dec 2005 18:35:08 -0000	1.466
--- configure	8 Dec 2005 17:08:44 -0000
***************
*** 860,865 ****
--- 860,866 ----
    --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
    --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
    --enable-integer-datetimes  enable 64-bit integer date/time support
+   --enable-standard-strings enable strings without backslash as an escape
    --enable-nls[=LANGUAGES]  enable Native Language Support
    --disable-shared        do not build shared libraries
    --disable-rpath         do not embed shared library search path in executables
***************
*** 1728,1733 ****
--- 1729,1772 ----
  
  
  #
+ # standard conforming strings (--enable-standard-strings)
+ #
+ echo "$as_me:$LINENO: checking whether to build with standard conforming strings" >&5
+ echo $ECHO_N "checking whether to build with standard conforming strings... $ECHO_C" >&6
+ 
+ 
+ # Check whether --enable-standard-strings or --disable-standard-strings was given.
+ if test "${enable_standard_strings+set}" = set; then
+   enableval="$enable_standard_strings"
+ 
+   case $enableval in
+     yes)
+ 
+ cat >>confdefs.h <<\_ACEOF
+ #define USE_STANDARD_CONFORMING_STRINGS 1
+ _ACEOF
+ 
+       ;;
+     no)
+       :
+       ;;
+     *)
+       { { echo "$as_me:$LINENO: error: no argument expected for --enable-standard-strings option" >&5
+ echo "$as_me: error: no argument expected for --enable-standard-strings option" >&2;}
+    { (exit 1); exit 1; }; }
+       ;;
+   esac
+ 
+ else
+   enable_standard_strings=no
+ 
+ fi;
+ 
+ echo "$as_me:$LINENO: result: $enable_standard_strings" >&5
+ echo "${ECHO_T}$enable_standard_strings" >&6
+ 
+ 
+ #
  # NLS
  #
  echo "$as_me:$LINENO: checking whether NLS is wanted" >&5
Index: configure.in
===================================================================
RCS file: /projects/cvsroot/pgsql/configure.in,v
retrieving revision 1.436
diff -c -r1.436 configure.in
*** configure.in	6 Dec 2005 18:35:09 -0000	1.436
--- configure.in	8 Dec 2005 17:08:44 -0000
***************
*** 149,154 ****
--- 149,164 ----
  
  
  #
+ # standard conforming strings (--enable-standard-strings)
+ #
+ AC_MSG_CHECKING([whether to build with standard conforming strings])
+ PGAC_ARG_BOOL(enable, standard-strings, no, [  --enable-standard-strings  enable standard conforming strings],
+               [AC_DEFINE([USE_STANDARD_CONFORMING_STRINGS], 1,
+                          [Define to 1 if you want standard conforming strings. (--enable-standard-strings)])])
+ AC_MSG_RESULT([$enable_standard_strings])
+ 
+ 
+ #
  # NLS
  #
  AC_MSG_CHECKING([whether NLS is wanted])
Index: src/backend/parser/scan.l
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/parser/scan.l,v
retrieving revision 1.128
diff -c -r1.128 scan.l
*** src/backend/parser/scan.l	16 Aug 2005 00:48:12 -0000	1.128
--- src/backend/parser/scan.l	8 Dec 2005 17:08:45 -0000
***************
*** 50,55 ****
--- 50,58 ----
  static int		xcdepth = 0;	/* depth of nesting in slash-star comments */
  static char    *dolqstart;      /* current $foo$ quote start string */
  
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ static bool 	standard_string;
+ #else
  /*
   * GUC variable.  This is a DIRECT violation of the warning given at the
   * head of gram.y, ie flex/bison code must not depend on any GUC variables;
***************
*** 60,65 ****
--- 63,69 ----
  bool			escape_string_warning;
  
  static bool		warn_on_first_escape;
+ #endif
  
  /*
   * literalbuf is used to accumulate literal values when multiple rules
***************
*** 76,82 ****
--- 80,88 ----
  static void addlitchar(unsigned char ychar);
  static char *litbufdup(void);
  static int	pg_err_position(void);
+ #ifndef USE_STANDARD_CONFORMING_STRINGS
  static void check_escape_warning(void);
+ #endif
  
  /*
   * When we parse a token that requires multiple lexer rules to process,
***************
*** 301,309 ****
   * Dollar quoted strings are totally opaque, and no escaping is done on them.
   * Other quoted strings must allow some special characters such as single-quote
   *  and newline.
!  * Embedded single-quotes are implemented both in the SQL standard
!  *  style of two adjacent single quotes "''" and in the Postgres/Java style
!  *  of escaped-quote "\'".
   * Other embedded escaped characters are matched explicitly and the leading
   *  backslash is dropped from the string.
   * Note that xcstart must appear before operator, as explained above!
--- 307,316 ----
   * Dollar quoted strings are totally opaque, and no escaping is done on them.
   * Other quoted strings must allow some special characters such as single-quote
   *  and newline.
!  * Standard quoted strings allow a single-quote to be represented in the SQL
!  *  standard style of two adjacent single quotes "''".
!  * Extended quoted strings support embedded single-quotes in both in the SQL
!  *  standard style and in the Postgres/Java style of escaped-quote "\'".
   * Other embedded escaped characters are matched explicitly and the leading
   *  backslash is dropped from the string.
   * Note that xcstart must appear before operator, as explained above!
***************
*** 426,438 ****
--- 433,453 ----
  				}
  
  {xqstart}		{
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 					standard_string = true;
+ #else
  					warn_on_first_escape = true;
+ #endif
  					token_start = yytext;
  					BEGIN(xq);
  					startlit();
  				}
  {xestart}		{
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 					standard_string = false;
+ #else
  					warn_on_first_escape = false;
+ #endif
  					token_start = yytext;
  					BEGIN(xq);
  					startlit();
***************
*** 451,456 ****
--- 466,477 ----
  					addlit(yytext, yyleng);
  				}
  <xq>{xqescape}  {
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 					if (standard_string)
+ 						addlit(yytext, yyleng);
+ 					else
+ 						addlitchar(unescape_single_char(yytext[1]));
+ #else
  					if (yytext[1] == '\'')
  					{
  						if (warn_on_first_escape && escape_string_warning)
***************
*** 474,491 ****
--- 495,527 ----
  					else
  						check_escape_warning();
  					addlitchar(unescape_single_char(yytext[1]));
+ #endif
  				}
  <xq>{xqoctesc}  {
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 					if (standard_string)
+ 						addlit(yytext, yyleng);
+ 					else
+ 						addlitchar(strtoul(yytext+1, NULL, 8));
+ #else
  					unsigned char c = strtoul(yytext+1, NULL, 8);
  
  					check_escape_warning();
  					addlitchar(c);
+ #endif
  				}
  <xq>{xqhexesc}  {
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 					if (standard_string)
+ 						addlit(yytext, yyleng);
+ 					else
+ 						addlitchar(strtoul(yytext+2, NULL, 16));
+ #else
  					unsigned char c = strtoul(yytext+2, NULL, 16);
  
  					check_escape_warning();
  					addlitchar(c);
+ #endif
  				}
  <xq>{quotecontinue} {
  					/* ignore */
***************
*** 874,880 ****
  			return c;
  	}
  }
! 
  static void
  check_escape_warning(void)
  {
--- 910,916 ----
  			return c;
  	}
  }
! #ifndef USE_STANDARD_CONFORMING_STRINGS
  static void
  check_escape_warning(void)
  {
***************
*** 886,888 ****
--- 922,925 ----
  				 errposition(pg_err_position())));
  	warn_on_first_escape = false;	/* warn only once per string */
  }
+ #endif
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.301
diff -c -r1.301 guc.c
*** src/backend/utils/misc/guc.c	22 Nov 2005 18:17:26 -0000	1.301
--- src/backend/utils/misc/guc.c	8 Dec 2005 17:08:46 -0000
***************
*** 957,963 ****
  		&pg_krb_caseins_users,
  		false, NULL, NULL
  	},
! 
  	{
  		{"escape_string_warning", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS,
  			gettext_noop("Warn about backslash escapes in ordinary string literals."),
--- 957,963 ----
  		&pg_krb_caseins_users,
  		false, NULL, NULL
  	},
! #ifndef USE_STANDARD_CONFORMING_STRINGS
  	{
  		{"escape_string_warning", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS,
  			gettext_noop("Warn about backslash escapes in ordinary string literals."),
***************
*** 966,972 ****
  		&escape_string_warning,
  		false, NULL, NULL
  	},
! 
  	{
  		{"standard_conforming_strings", PGC_INTERNAL, PRESET_OPTIONS,
  			gettext_noop("'...' strings treat backslashes literally."),
--- 966,972 ----
  		&escape_string_warning,
  		false, NULL, NULL
  	},
! #endif
  	{
  		{"standard_conforming_strings", PGC_INTERNAL, PRESET_OPTIONS,
  			gettext_noop("'...' strings treat backslashes literally."),
***************
*** 974,980 ****
--- 974,984 ----
  			GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
  		},
  		&standard_conforming_strings,
+ #ifdef USE_STANDARD_CONFORMING_STRINGS
+ 		true, NULL, NULL
+ #else
  		false, NULL, NULL
+ #endif
  	},
  
  	/* End-of-list marker */
Index: src/include/pg_config.h.in
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/pg_config.h.in,v
retrieving revision 1.88
diff -c -r1.88 pg_config.h.in
*** src/include/pg_config.h.in	6 Dec 2005 02:29:03 -0000	1.88
--- src/include/pg_config.h.in	8 Dec 2005 17:08:46 -0000
***************
*** 633,638 ****
--- 633,642 ----
     (--enable-integer-datetimes) */
  #undef USE_INTEGER_DATETIMES
  
+ /* Define to 1 if you want standard conforming strings.
+    (--enable-standard-strings) */
+ #undef USE_STANDARD_CONFORMING_STRINGS
+ 
  /* Define to select named POSIX semaphores. */
  #undef USE_NAMED_POSIX_SEMAPHORES
  
Index: src/include/utils/guc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/guc.h,v
retrieving revision 1.63
diff -c -r1.63 guc.h
*** src/include/utils/guc.h	15 Oct 2005 02:49:46 -0000	1.63
--- src/include/utils/guc.h	8 Dec 2005 17:08:46 -0000
***************
*** 120,126 ****
--- 120,128 ----
  extern bool Australian_timezones;
  
  extern bool default_with_oids;
+ #ifndef USE_STANDARD_CONFORMING_STRINGS
  extern bool escape_string_warning;
+ #endif
  
  extern int	log_min_error_statement;
  extern int	log_min_messages;
#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kevin Grittner (#1)
Re: Backslashes in string literals

I think we will be turning on escape_string_warning in 8.2 and allowing
standard_conforming_strings to be optionally turned on in that release.
I will keep the patch for us in completing that item.

This has been saved for the 8.2 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Kevin Grittner wrote:

I've just been bitten by the "backslash in string literals" issue. I
have reviewed the mailing lists and the TODO list. I see that the
direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch. It's probably somewhat naive, but
maybe it can be helpful.

-Kevin

[ Attachment, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Peter Eisentraut
peter_e@gmx.net
In reply to: Kevin Grittner (#1)
Re: Backslashes in string literals

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#3)
Re: Backslashes in string literals

Peter Eisentraut wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

Yes, but right now it is read-only. We didn't have time to allow it to
be set to true in this release. I think it has to wait for 8.2.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Peter Eisentraut (#3)
Re: Backslashes in string literals

On Fri, Dec 9, 2005 at 11:24 am, in message

<200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
<peter_e@gmx.net> wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

As Bruce has mentioned, this is currently read-only, set to off.

I needed something fast, and I could see a way to do it quickly with a
configure switch, to compile it for standard behavior. Since the
non-standard behavior is in the lexer, I couldn't see any reasonable way
to base it on a runtime switch. I'm curious what is intended here. Can
anyone give a one-paragraph explanation of how this configuration option
will work?

-Kevin

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kevin Grittner (#5)
Re: Backslashes in string literals

Kevin Grittner wrote:

On Fri, Dec 9, 2005 at 11:24 am, in message

<200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
<peter_e@gmx.net> wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

As Bruce has mentioned, this is currently read-only, set to off.

I needed something fast, and I could see a way to do it quickly with a
configure switch, to compile it for standard behavior. Since the
non-standard behavior is in the lexer, I couldn't see any reasonable way
to base it on a runtime switch. I'm curious what is intended here. Can
anyone give a one-paragraph explanation of how this configuration option
will work?

Have you read our documentation?

http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-SYNTAX-CONSTANTS
http://www.postgresql.org/docs/8.1/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#6)
Re: Backslashes in string literals

On Sat, Dec 10, 2005 at 8:01 pm, in message

<200512110201.jBB21PE16562@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

Kevin Grittner wrote:

Since the non-standard behavior is in the lexer, I couldn't see any
reasonable way to base it on a runtime switch. I'm curious what is
intended here. Can anyone give a one-paragraph explanation of how this
configuration option will work?

Have you read our documentation?

http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-SYNTAX-CONSTANTS
http://www.postgresql.org/docs/8.1/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION

Yes.

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

Perhaps my language was ambiguous. I'm not curious about the intended
behavior from a user perspective, but what I might have missed in the
source code which would have allowed me to write my patch to better
comply with the documentation you cited. Since the problem is in the
lexer, the only way I could see to implement it as a run-time
configuration option, rather than a compile-time option, would be to
duplicate the lexer and maintain two sets of rules in parallel. I
generally try to avoid maintaining two parallel copies of code. I'm
curious whether I missed some other programming approach.

-Kevin

#8Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kevin Grittner (#7)
Re: Backslashes in string literals

Kevin Grittner wrote:

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

Perhaps my language was ambiguous. I'm not curious about the intended
behavior from a user perspective, but what I might have missed in the
source code which would have allowed me to write my patch to better
comply with the documentation you cited. Since the problem is in the
lexer, the only way I could see to implement it as a run-time
configuration option, rather than a compile-time option, would be to
duplicate the lexer and maintain two sets of rules in parallel. I
generally try to avoid maintaining two parallel copies of code. I'm
curious whether I missed some other programming approach.

Oh, that question. :-) We haven't looked yet at what it will require
to do this in the lexer, but I think what we will eventually do is to
add a more generalized filter to the lexer, and have the actions behave
differently based on the boolean of whether we are using sql-standard
strings.

If you keep your eye on hackers or the committers messages you will see
when we commit the change for 8.2.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#2)
1 attachment(s)
Re: Backslashes in string literals

We found a bug in the code from my first patch. Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely. Attached is a patch
which should come close to implementing the TODO. In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For our
needs, I could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part. Alternatively, if you give me some direction, I might have time
to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

(2) There should probably be some tests added to exercise these
options.

(3) I took a quick shot at the documentation, but I'm sure I didn't
cover everything.

(4) I made the changes on REL8_1_STABLE, since we need it now. This
patch is relative to that branch, not the trunk.

I hope this is helpful. If there's anything I should have done
differently, please let me know, so I can try to do better next time.

-Kevin
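
On issue (1), one possible direction is for the client to act on the value the server reports for standard_conforming_strings, since the guc.c entry is flagged GUC_REPORT. The sketch below assumes the client already holds that value as a string ("on", "off", or NULL when the server does not know the parameter). In a libpq program it could come from `PQparameterStatus(conn, "standard_conforming_strings")`. The helper names `backslash_is_escape` and `quote_literal_body` are hypothetical, and none of this is part of the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether a backslash inside an ordinary '...' literal acts as an
 * escape. A missing value is treated like "off" (an assumption: a server
 * that does not report the parameter is taken to use the old behavior).
 */
static bool
backslash_is_escape(const char *reported_value)
{
    if (reported_value == NULL)
        return true;
    return strcmp(reported_value, "on") != 0;
}

/*
 * Copy src into dst as the body of an ordinary '...' literal, doubling
 * single quotes always and backslashes only when they act as escapes.
 * Returns the number of bytes written, not counting the trailing NUL.
 * Note: this sketch stops silently when dst is full; real quoting code
 * would have to report that as an error instead.
 */
static size_t
quote_literal_body(const char *src, bool bs_is_escape, char *dst, size_t dstsz)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dstsz > 0);

    for (; *src != '\0'; src++)
    {
        bool dbl = (*src == '\'') || (bs_is_escape && *src == '\\');

        if (n + (dbl ? 2 : 1) >= dstsz)
            break;              /* no room for this character plus NUL */
        if (dbl)
            dst[n++] = *src;    /* first half of the doubled pair */
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

With the setting off, the three-byte input `a\b` is written as the four bytes `a\\b`. With it on, the input is copied through unchanged.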

On Fri, Dec 9, 2005 at 11:23 am, in message

<200512091723.jB9HNjB16785@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

I think we will be turning on escape_string_warning in 8.2 and allowing
standard_conforming_strings to be optionally turned on in that release.
I will keep the patch for us in completing that item.

This has been saved for the 8.2 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

Attachments:

configurable-unified-patch.txt (application/octet-stream)
Index: doc/TODO
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/TODO,v
retrieving revision 1.1696
diff -c -r1.1696 TODO
*** doc/TODO	27 Oct 2005 14:16:02 -0000	1.1696
--- doc/TODO	25 Jan 2006 22:22:56 -0000
***************
*** 391,397 ****
  * Allow EXPLAIN to identify tables that were skipped because of 
    constraint_exclusion
  * Allow EXPLAIN output to be more easily processed by scripts
- * Eventually enable escape_string_warning and standard_conforming_strings
  * Simplify dropping roles that have objects in several databases
  
  
--- 391,396 ----
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.36
diff -c -r1.36 config.sgml
*** doc/src/sgml/config.sgml	4 Nov 2005 23:53:18 -0000	1.36
--- doc/src/sgml/config.sgml	25 Jan 2006 22:22:57 -0000
***************
*** 3728,3737 ****
         </para>
         <para>
          Escape string syntax (<literal>E'...'</>) should be used for
!         escapes, because in future versions of
!         <productname>PostgreSQL</productname> ordinary strings will have
          the standard-conforming behavior of treating backslashes
!         literally.
         </para>
        </listitem>
       </varlistentry>
--- 3728,3755 ----
         </para>
         <para>
          Escape string syntax (<literal>E'...'</>) should be used for
!         backslash escape sequences, because ordinary strings have
          the standard-conforming behavior of treating backslashes
!         literally when the <literal>standard-conforming-strings</>
!         option is set <literal>on</>.
!        </para>
!       </listitem>
!      </varlistentry>
! 
!      <varlistentry id="guc-standard-conforming-strings" xreflabel="standard_conforming_strings">
!       <term><varname>standard_conforming_strings</varname> (<type>boolean</type>)</term>
!       <indexterm><primary>strings</><secondary>escape</></>
!       <indexterm>
!        <primary><varname>standard_conforming_strings</> configuration parameter</primary>
!       </indexterm>
!       <listitem>
!        <para>
!         Controls whether ordinary string literals
!         (<literal>'...'</>) treat backslashes literally, as specified in
!         the SQL standard.  Applications may check this
!         parameter to determine how string literals will be processed.
!         The presence of this parameter can also be taken as an indication
!         that the escape string syntax (<literal>E'...'</>) is supported.
         </para>
        </listitem>
       </varlistentry>
***************
*** 3944,3971 ****
        </listitem>
       </varlistentry>
  
-      <varlistentry id="guc-standard-conforming-strings" xreflabel="standard_conforming_strings">
-       <term><varname>standard_conforming_strings</varname> (<type>boolean</type>)</term>
-       <indexterm><primary>strings</><secondary>escape</></>
-       <indexterm>
-        <primary><varname>standard_conforming_strings</> configuration parameter</primary>
-       </indexterm>
-       <listitem>
-        <para>
-         Reports whether ordinary string literals
-         (<literal>'...'</>) treat backslashes literally, as specified in
-         the SQL standard.  The value is currently always <literal>off</>,
-         indicating that backslashes are treated as escapes.  It is planned
-         that this will change to <literal>on</> in a future
-         <productname>PostgreSQL</productname> release when string literal
-         syntax changes to meet the standard.  Applications may check this
-         parameter to determine how string literals will be processed.
-         The presence of this parameter can also be taken as an indication
-         that the escape string syntax (<literal>E'...'</>) is supported.
-        </para>
-       </listitem>
-      </varlistentry>
- 
      </variablelist>
     </sect1>
  
--- 3962,3967 ----
Index: src/backend/parser/scan.l
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/parser/scan.l,v
retrieving revision 1.128
diff -c -r1.128 scan.l
*** src/backend/parser/scan.l	16 Aug 2005 00:48:12 -0000	1.128
--- src/backend/parser/scan.l	25 Jan 2006 22:22:57 -0000
***************
*** 51,63 ****
  static char    *dolqstart;      /* current $foo$ quote start string */
  
  /*
!  * GUC variable.  This is a DIRECT violation of the warning given at the
   * head of gram.y, ie flex/bison code must not depend on any GUC variables;
   * as such, changing its value can induce very unintuitive behavior.
   * But we shall have to live with it as a short-term thing until the switch
   * to SQL-standard string syntax is complete.
   */
  bool			escape_string_warning;
  
  static bool		warn_on_first_escape;
  
--- 51,64 ----
  static char    *dolqstart;      /* current $foo$ quote start string */
  
  /*
!  * GUC variables.  This is a DIRECT violation of the warning given at the
   * head of gram.y, ie flex/bison code must not depend on any GUC variables;
   * as such, changing its value can induce very unintuitive behavior.
   * But we shall have to live with it as a short-term thing until the switch
   * to SQL-standard string syntax is complete.
   */
  bool			escape_string_warning;
+ bool			standard_conforming_strings;
  
  static bool		warn_on_first_escape;
  
***************
*** 77,82 ****
--- 78,84 ----
  static char *litbufdup(void);
  static int	pg_err_position(void);
  static void check_escape_warning(void);
+ static void check_string_escape_warning(unsigned char ychar);
  
  /*
   * When we parse a token that requires multiple lexer rules to process,
***************
*** 119,125 ****
   *  <xc> extended C-style comments
   *  <xd> delimited identifiers (double-quoted identifiers)
   *  <xh> hexadecimal numeric string
!  *  <xq> quoted strings
   *  <xdolq> $foo$ quoted strings
   */
  
--- 121,128 ----
   *  <xc> extended C-style comments
   *  <xd> delimited identifiers (double-quoted identifiers)
   *  <xh> hexadecimal numeric string
!  *  <xq> standard quoted strings
!  *  <xe> extended quoted strings (support backslash escape sequences)
   *  <xdolq> $foo$ quoted strings
   */
  
***************
*** 127,132 ****
--- 130,136 ----
  %x xc
  %x xd
  %x xh
+ %x xe
  %x xq
  %x xdolq
  
***************
*** 200,205 ****
--- 204,213 ----
  
  /* Quoted string that allows backslash escapes */
  xestart			[eE]{quote}
+ xeinside		[^\\']+
+ xeescape		[\\][^0-7]
+ xeoctesc		[\\][0-7]{1,3}
+ xehexesc		[\\]x[0-9A-Fa-f]{1,2}
  
  /* Extended quote
   * xqdouble implements embedded quote, ''''
***************
*** 207,215 ****
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqescape		[\\][^0-7]
! xqoctesc		[\\][0-7]{1,3}
! xqhexesc		[\\]x[0-9A-Fa-f]{1,2}
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
--- 215,221 ----
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqbackslash	[\\]
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
***************
*** 428,500 ****
  {xqstart}		{
  					warn_on_first_escape = true;
  					token_start = yytext;
! 					BEGIN(xq);
  					startlit();
  				}
  {xestart}		{
  					warn_on_first_escape = false;
  					token_start = yytext;
! 					BEGIN(xq);
  					startlit();
  				}
! <xq>{quotestop}	|
! <xq>{quotefail} {
  					yyless(1);
  					BEGIN(INITIAL);
  					yylval.str = litbufdup();
  					return SCONST;
  				}
! <xq>{xqdouble}  {
  					addlitchar('\'');
  				}
  <xq>{xqinside}  {
  					addlit(yytext, yyleng);
  				}
! <xq>{xqescape}  {
! 					if (yytext[1] == '\'')
! 					{
! 						if (warn_on_first_escape && escape_string_warning)
! 							ereport(WARNING,
! 									(errcode(ERRCODE_NONSTANDARD_USE_OF_ESCAPE_CHARACTER),
! 									 errmsg("nonstandard use of \\' in a string literal"),
! 									 errhint("Use '' to write quotes in strings, or use the escape string syntax (E'...')."),
! 									 errposition(pg_err_position())));
! 						warn_on_first_escape = false;	/* warn only once per string */
! 					}
! 					else if (yytext[1] == '\\')
! 					{
! 						if (warn_on_first_escape && escape_string_warning)
! 							ereport(WARNING,
! 									(errcode(ERRCODE_NONSTANDARD_USE_OF_ESCAPE_CHARACTER),
! 									 errmsg("nonstandard use of \\\\ in a string literal"),
! 									 errhint("Use the escape string syntax for backslashes, e.g., E'\\\\'."),
! 									 errposition(pg_err_position())));
! 						warn_on_first_escape = false;	/* warn only once per string */
! 					}
! 					else
! 						check_escape_warning();
  					addlitchar(unescape_single_char(yytext[1]));
  				}
! <xq>{xqoctesc}  {
  					unsigned char c = strtoul(yytext+1, NULL, 8);
  
  					check_escape_warning();
  					addlitchar(c);
  				}
! <xq>{xqhexesc}  {
  					unsigned char c = strtoul(yytext+2, NULL, 16);
  
  					check_escape_warning();
  					addlitchar(c);
  				}
! <xq>{quotecontinue} {
  					/* ignore */
  				}
! <xq>.			{
  					/* This is only needed for \ just before EOF */
  					addlitchar(yytext[0]);
  				}
! <xq><<EOF>>		{ yyerror("unterminated quoted string"); }
  
  {dolqdelim}		{
  					token_start = yytext;
--- 434,495 ----
  {xqstart}		{
  					warn_on_first_escape = true;
  					token_start = yytext;
! 					if (standard_conforming_strings)
! 						BEGIN(xq);
! 					else
! 						BEGIN(xe);
  					startlit();
  				}
  {xestart}		{
  					warn_on_first_escape = false;
  					token_start = yytext;
! 					BEGIN(xe);
  					startlit();
  				}
! <xq,xe>{quotestop}	|
! <xq,xe>{quotefail} {
  					yyless(1);
  					BEGIN(INITIAL);
  					yylval.str = litbufdup();
  					return SCONST;
  				}
! <xq,xe>{xqdouble}  {
  					addlitchar('\'');
  				}
  <xq>{xqinside}  {
  					addlit(yytext, yyleng);
  				}
! <xe>{xeinside}  {
! 					addlit(yytext, yyleng);
! 				}
! <xq>{xqbackslash} {
! 					check_string_escape_warning(yytext[1]);
! 					addlitchar('\\');
!                 }
! <xe>{xeescape}  {
! 					check_string_escape_warning(yytext[1]);
  					addlitchar(unescape_single_char(yytext[1]));
  				}
! <xe>{xeoctesc}  {
  					unsigned char c = strtoul(yytext+1, NULL, 8);
  
  					check_escape_warning();
  					addlitchar(c);
  				}
! <xe>{xehexesc}  {
  					unsigned char c = strtoul(yytext+2, NULL, 16);
  
  					check_escape_warning();
  					addlitchar(c);
  				}
! <xq,xe>{quotecontinue} {
  					/* ignore */
  				}
! <xe>.			{
  					/* This is only needed for \ just before EOF */
  					addlitchar(yytext[0]);
  				}
! <xq,xe><<EOF>>		{ yyerror("unterminated quoted string"); }
  
  {dolqdelim}		{
  					token_start = yytext;
***************
*** 876,881 ****
--- 871,903 ----
  }
  
  static void
+ check_string_escape_warning(unsigned char ychar)
+ {
+ 	if (ychar == '\'')
+ 	{
+ 		if (warn_on_first_escape && escape_string_warning)
+ 			ereport(WARNING,
+ 					(errcode(ERRCODE_NONSTANDARD_USE_OF_ESCAPE_CHARACTER),
+ 					 errmsg("nonstandard use of \\' in a string literal"),
+ 					 errhint("Use '' to write quotes in strings, or use the escape string syntax (E'...')."),
+ 					 errposition(pg_err_position())));
+ 		warn_on_first_escape = false;	/* warn only once per string */
+ 	}
+ 	else if (ychar == '\\')
+ 	{
+ 		if (warn_on_first_escape && escape_string_warning)
+ 			ereport(WARNING,
+ 					(errcode(ERRCODE_NONSTANDARD_USE_OF_ESCAPE_CHARACTER),
+ 					 errmsg("nonstandard use of \\\\ in a string literal"),
+ 					 errhint("Use the escape string syntax for backslashes, e.g., E'\\\\'."),
+ 					 errposition(pg_err_position())));
+ 		warn_on_first_escape = false;	/* warn only once per string */
+ 	}
+ 	else
+ 		check_escape_warning();
+ }
+ 
+ static void
  check_escape_warning(void)
  {
  	if (warn_on_first_escape && escape_string_warning)
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.299.2.1
diff -c -r1.299.2.1 guc.c
*** src/backend/utils/misc/guc.c	22 Nov 2005 18:23:24 -0000	1.299.2.1
--- src/backend/utils/misc/guc.c	25 Jan 2006 22:22:58 -0000
***************
*** 219,225 ****
  static int	max_identifier_length;
  static int	block_size;
  static bool integer_datetimes;
- static bool standard_conforming_strings;
  
  /* should be static, but commands/variable.c needs to get at these */
  char	   *role_string;
--- 219,224 ----
***************
*** 958,967 ****
  	},
  
  	{
! 		{"standard_conforming_strings", PGC_INTERNAL, PRESET_OPTIONS,
  			gettext_noop("'...' strings treat backslashes literally."),
! 			NULL,
! 			GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
  		},
  		&standard_conforming_strings,
  		false, NULL, NULL
--- 957,965 ----
  	},
  
  	{
! 		{"standard_conforming_strings", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS,
  			gettext_noop("'...' strings treat backslashes literally."),
! 			NULL
  		},
  		&standard_conforming_strings,
  		false, NULL, NULL
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.168.2.2
diff -c -r1.168.2.2 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	10 Nov 2005 14:02:59 -0000	1.168.2.2
--- src/backend/utils/misc/postgresql.conf.sample	25 Jan 2006 22:22:58 -0000
***************
*** 416,422 ****
  #regex_flavor = advanced		# advanced, extended, or basic
  #sql_inheritance = on
  #default_with_oids = off
! #escape_string_warning = off
  
  # - Other Platforms & Clients -
  
--- 416,423 ----
  #regex_flavor = advanced		# advanced, extended, or basic
  #sql_inheritance = on
  #default_with_oids = off
! #escape_string_warning = off		# warn about backslashes in string literals
! #standard_conforming_strings = off	# interpret string literals according to ANSI/ISO standards
  
  # - Other Platforms & Clients -
  
Index: src/bin/psql/psqlscan.l
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/psqlscan.l,v
retrieving revision 1.15
diff -c -r1.15 psqlscan.l
*** src/bin/psql/psqlscan.l	26 Jun 2005 19:16:06 -0000	1.15
--- src/bin/psql/psqlscan.l	25 Jan 2006 22:22:58 -0000
***************
*** 122,127 ****
--- 122,129 ----
  
  #define ECHO emit(yytext, yyleng)
  
+ bool standard_conforming_strings;
+ 
  %}
  
  %option 8bit
***************
*** 154,160 ****
   *  <xc> extended C-style comments
   *  <xd> delimited identifiers (double-quoted identifiers)
   *  <xh> hexadecimal numeric string
!  *  <xq> quoted strings
   *  <xdolq> $foo$ quoted strings
   */
  
--- 156,163 ----
   *  <xc> extended C-style comments
   *  <xd> delimited identifiers (double-quoted identifiers)
   *  <xh> hexadecimal numeric string
!  *  <xq> standard quoted strings
!  *  <xe> extended quoted strings (support backslash escape sequences)
   *  <xdolq> $foo$ quoted strings
   */
  
***************
*** 162,167 ****
--- 165,171 ----
  %x xc
  %x xd
  %x xh
+ %x xe
  %x xq
  %x xdolq
  /* Additional exclusive states for psql only: lex backslash commands */
***************
*** 244,249 ****
--- 248,257 ----
  
  /* Quoted string that allows backslash escapes */
  xestart			[eE]{quote}
+ xeinside		[^\\']+
+ xeescape		[\\][^0-7]
+ xeoctesc		[\\][0-7]{1,3}
+ xehexesc		[\\]x[0-9A-Fa-f]{1,2}
  
  /* Extended quote
   * xqdouble implements embedded quote, ''''
***************
*** 251,259 ****
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqescape		[\\][^0-7]
! xqoctesc		[\\][0-7]{1,3}
! xqhexesc		[\\]x[0-9A-Fa-f]{1,2}
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
--- 259,265 ----
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqbackslash	[\\]
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
***************
*** 448,485 ****
  				}
  
  {xqstart}		{
! 					BEGIN(xq);
  					ECHO;
  				}
  {xestart}		{
! 					BEGIN(xq);
  					ECHO;
  				}
! <xq>{quotestop}	|
! <xq>{quotefail} {
  					yyless(1);
  					BEGIN(INITIAL);
  					ECHO;
  				}
! <xq>{xqdouble}  {
  					ECHO;
  				}
  <xq>{xqinside}  {
  					ECHO;
  				}
! <xq>{xqescape}  {
  					ECHO;
  				}
! <xq>{xqoctesc}  {
  					ECHO;
  				}
! <xq>{xqhexesc}  {
  					ECHO;
  				}
! <xq>{quotecontinue} {
  					ECHO;
  				}
! <xq>.			{
  					/* This is only needed for \ just before EOF */
  					ECHO;
  				}
--- 454,504 ----
  				}
  
  {xqstart}		{
! 					/*
! 					 * if (standard_conforming_strings)
! 					 */
! 						BEGIN(xq);
! 					/*
! 					 * else
! 					 *	BEGIN(xe);
! 					 */
  					ECHO;
  				}
  {xestart}		{
! 					BEGIN(xe);
  					ECHO;
  				}
! <xq,xe>{quotestop}	|
! <xq,xe>{quotefail} {
  					yyless(1);
  					BEGIN(INITIAL);
  					ECHO;
  				}
! <xq,xe>{xqdouble}  {
  					ECHO;
  				}
  <xq>{xqinside}  {
  					ECHO;
  				}
! <xq>{xqbackslash} {
! 					ECHO;
!                 }
! <xe>{xeinside}  {
  					ECHO;
  				}
! <xe>{xeescape}  {
  					ECHO;
  				}
! <xe>{xeoctesc}  {
  					ECHO;
  				}
! <xe>{xehexesc}  {
  					ECHO;
  				}
! <xq,xe>{quotecontinue} {
! 					ECHO;
! 				}
! <xe>.			{
  					/* This is only needed for \ just before EOF */
  					ECHO;
  				}
***************
*** 858,870 ****
  "\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
  "\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
  
! {xqoctesc}		{
  					/* octal case */
  					appendPQExpBufferChar(output_buf,
  										  (char) strtol(yytext + 1, NULL, 8));
  				}
  
! {xqhexesc}		{
  					/* hex case */
  					appendPQExpBufferChar(output_buf,
  										  (char) strtol(yytext + 2, NULL, 16));
--- 877,889 ----
  "\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
  "\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
  
! {xeoctesc}		{
  					/* octal case */
  					appendPQExpBufferChar(output_buf,
  										  (char) strtol(yytext + 1, NULL, 8));
  				}
  
! {xehexesc}		{
  					/* hex case */
  					appendPQExpBufferChar(output_buf,
  										  (char) strtol(yytext + 2, NULL, 16));
***************
*** 1128,1133 ****
--- 1147,1156 ----
  					result = PSCAN_INCOMPLETE;
  					*prompt = PROMPT_SINGLEQUOTE;
  					break;
+ 				case xe:
+ 					result = PSCAN_INCOMPLETE;
+ 					*prompt = PROMPT_SINGLEQUOTE;
+ 					break;
  				case xdolq:
  					result = PSCAN_INCOMPLETE;
  					*prompt = PROMPT_DOLLARQUOTE;
Index: src/include/utils/guc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/guc.h,v
retrieving revision 1.63
diff -c -r1.63 guc.h
*** src/include/utils/guc.h	15 Oct 2005 02:49:46 -0000	1.63
--- src/include/utils/guc.h	25 Jan 2006 22:22:58 -0000
***************
*** 121,126 ****
--- 121,127 ----
  
  extern bool default_with_oids;
  extern bool escape_string_warning;
+ extern bool standard_conforming_strings;
  
  extern int	log_min_error_statement;
  extern int	log_min_messages;
Index: src/interfaces/ecpg/preproc/pgc.l
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/ecpg/preproc/pgc.l,v
retrieving revision 1.137
diff -c -r1.137 pgc.l
*** src/interfaces/ecpg/preproc/pgc.l	5 Oct 2005 14:58:36 -0000	1.137
--- src/interfaces/ecpg/preproc/pgc.l	25 Jan 2006 22:22:58 -0000
***************
*** 30,35 ****
--- 30,36 ----
  static int		xcdepth = 0;	/* depth of nesting in slash-star comments */
  static char    *dolqstart;      /* current $foo$ quote start string */
  bool                    escape_string_warning;
+ bool                    standard_conforming_strings;
  static bool             warn_on_first_escape;
  
  /*
***************
*** 96,102 ****
   *	<xc> extended C-style comments - thomas 1997-07-12
   *	<xd> delimited identifiers (double-quoted identifiers) - thomas 1997-10-27
   *	<xh> hexadecimal numeric string - thomas 1997-11-16
!  *	<xq> quoted strings - thomas 1997-07-30
   *  	<xdolq> $foo$ quoted strings
   */
  
--- 97,104 ----
   *	<xc> extended C-style comments - thomas 1997-07-12
   *	<xd> delimited identifiers (double-quoted identifiers) - thomas 1997-10-27
   *	<xh> hexadecimal numeric string - thomas 1997-11-16
!  *	<xq> standard quoted strings - thomas 1997-07-30
!  *  <xe> extended quoted strings (support backslash escape sequences)
   *  	<xdolq> $foo$ quoted strings
   */
  
***************
*** 105,110 ****
--- 107,113 ----
  %x xd
  %x xdc
  %x xh
+ %x xe
  %x xq
  %x xdolq
  %x xpre
***************
*** 125,130 ****
--- 128,137 ----
  
  /* Quoted string that allows backslash escapes */
  xestart                 [eE]{quote}
+ xeinside                [^\\']+
+ xeescape	                [\\][^0-7]
+ xeoctesc	                [\\][0-7]{1,3}
+ xehexesc	                [\\]x[0-9A-Fa-f]{1,2}
  
  /* C version of hex number */
  xch			0[xX][0-9A-Fa-f]*
***************
*** 135,143 ****
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqescape		[\\][^0-7]
! xqoctesc		[\\][0-7]{1,3}
! xqhexesc		[\\]x[0-9A-Fa-f]{1,2}
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
--- 142,148 ----
  xqstart			{quote}
  xqdouble		{quote}{quote}
  xqinside		[^\\']+
! xqbackslash	[\\]
  
  /* $foo$ style quotes ("dollar quoting")
   * The quoted string starts with $foo$ where "foo" is an optional string
***************
*** 405,447 ****
  				warn_on_first_escape = true;
  				token_start = yytext;
  				state_before = YYSTATE;
! 				BEGIN(xq);
  				startlit();
  			}
  <C,SQL>{xestart}	{
  				warn_on_first_escape = false;
  				token_start = yytext;
  				state_before = YYSTATE;
! 				BEGIN(xq);
  				startlit();
  			}
! <xq>{quotestop} |
! <xq>{quotefail}		{
  				yyless(1);
  				BEGIN(state_before);
  				yylval.str = mm_strdup(literalbuf);
  				return SCONST;
  			}
! <xq>{xqdouble}		{ addlitchar('\''); }
  <xq>{xqinside}		{ addlit(yytext, yyleng); }
! <xq>{xqescape}  	{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xq>{xqoctesc}		{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xq>{xqhexesc}		{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xq>{quotecontinue}	{ /* ignore */ }
! <xq>.                   {
                                         /* This is only needed for \ just before EOF */
                                         addlitchar(yytext[0]);
                          }
! <xq><<EOF>>		{ mmerror(PARSE_ERROR, ET_FATAL, "Unterminated quoted string"); }
  <SQL>{dolqfailed}	{
  				/* throw back all but the initial "$" */
  				yyless(1);
--- 410,460 ----
  				warn_on_first_escape = true;
  				token_start = yytext;
  				state_before = YYSTATE;
! 				if (standard_conforming_strings)
! 					BEGIN(xq);
! 				else
! 					BEGIN(xe);
  				startlit();
  			}
  <C,SQL>{xestart}	{
  				warn_on_first_escape = false;
  				token_start = yytext;
  				state_before = YYSTATE;
! 				BEGIN(xe);
  				startlit();
  			}
! <xq,xe>{quotestop} |
! <xq,xe>{quotefail} {
  				yyless(1);
  				BEGIN(state_before);
  				yylval.str = mm_strdup(literalbuf);
  				return SCONST;
  			}
! <xq,xe>{xqdouble}		{ addlitchar('\''); }
  <xq>{xqinside}		{ addlit(yytext, yyleng); }
! <xe>{xeinside}		{ addlit(yytext, yyleng); }
! <xq>{xqbackslash} {
! 					check_escape_warning();
! 					addlitchar('\\');
!                 }
! <xe>{xeescape}  	{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xe>{xeoctesc}		{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xe>{xehexesc}		{ 
  				check_escape_warning();
  				addlit(yytext, yyleng);
  			}
! <xq,xe>{quotecontinue}	{ /* ignore */ }
! <xe>.                   {
                                         /* This is only needed for \ just before EOF */
                                         addlitchar(yytext[0]);
                          }
! <xq,xe><<EOF>>		{ mmerror(PARSE_ERROR, ET_FATAL, "Unterminated quoted string"); }
  <SQL>{dolqfailed}	{
  				/* throw back all but the initial "$" */
  				yyless(1);
#10Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#9)
1 attachment(s)
Re: Backslashes in string literals

On Wed, Jan 25, 2006 at 4:46 pm, in message
<43D7AB6B.EE98.0025.0@wicourts.gov>, "Kevin Grittner"
<Kevin.Grittner@wicourts.gov> wrote:

(2) There should probably be some tests added to exercise these
options.

Attached is a patch to address this one. Note that until psql is
fixed, this test will fail. I manually generated a portion of the
expected output to match what I expect to get once psql is fixed, so
there could be typos.
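
For readers following the test cases, here is a minimal sketch of the decoding the two settings are meant to produce. It is not code from the patch: decode_literal_body is a hypothetical helper, it works on the text between the quotes only, and it leaves out the octal and hex escapes that the real scanner handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: decode the text between the
 * quotes of a string literal.  With standard_conforming set, a backslash
 * is an ordinary character; otherwise it introduces an escape.  Octal and
 * hex escapes are omitted to keep the sketch short.
 */
void
decode_literal_body(const char *body, bool standard_conforming,
					char *out, size_t out_size)
{
	size_t		n = 0;

	/* decoding never lengthens the text, so this bound is sufficient */
	assert(strlen(body) < out_size);

	while (*body)
	{
		if (body[0] == '\'' && body[1] == '\'')
		{
			out[n++] = '\'';	/* a doubled quote is valid in both modes */
			body += 2;
		}
		else if (body[0] == '\\' && !standard_conforming && body[1] != '\0')
		{
			switch (body[1])
			{
				case 'b': out[n++] = '\b'; break;
				case 'f': out[n++] = '\f'; break;
				case 'n': out[n++] = '\n'; break;
				case 'r': out[n++] = '\r'; break;
				case 't': out[n++] = '\t'; break;
				default:  out[n++] = body[1]; break;	/* covers \\ and \' */
			}
			body += 2;
		}
		else
			out[n++] = *body++;	/* ordinary character, or a literal backslash */
	}
	out[n] = '\0';
}
```

Under these assumptions, the body a\bcd decoded with the setting on and the body a\\bcd decoded with the setting off both yield a\bcd, which is why the test queries expect identical result rows.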

-Kevin

Attachments:

test-string-patch.txtapplication/octet-stream; name=test-string-patch.txtDownload
Index: src/test/regress/expected/strings.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/strings.out,v
retrieving revision 1.26
diff -c -r1.26 strings.out
*** src/test/regress/expected/strings.out	10 Jul 2005 04:54:33 -0000	1.26
--- src/test/regress/expected/strings.out	26 Jan 2006 17:36:29 -0000
***************
*** 193,205 ****
  (1 row)
  
  -- PostgreSQL extension to allow using back reference in replace string;
! SELECT regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})', '(\\1) \\2-\\3');
   regexp_replace 
  ----------------
   (111) 222-3333
  (1 row)
  
! SELECT regexp_replace('AAA   BBB   CCC   ', '\\s+', ' ', 'g');
   regexp_replace 
  ----------------
   AAA BBB CCC 
--- 193,205 ----
  (1 row)
  
  -- PostgreSQL extension to allow using back reference in replace string;
! SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
   regexp_replace 
  ----------------
   (111) 222-3333
  (1 row)
  
! SELECT regexp_replace('AAA   BBB   CCC   ', E'\\s+', ' ', 'g');
   regexp_replace 
  ----------------
   AAA BBB CCC 
***************
*** 895,897 ****
--- 895,982 ----
   t
  (1 row)
  
+ --
+ -- test behavior of escape_string_warning and standard_conforming_strings options
+ --
+ set escape_string_warning = off;
+ set standard_conforming_strings = off;
+ show escape_string_warning;
+  escape_string_warning 
+ -----------------------
+  off
+ (1 row)
+ 
+ show standard_conforming_strings;
+  standard_conforming_strings 
+ -----------------------------
+  off
+ (1 row)
+ 
+ set escape_string_warning = on;
+ set standard_conforming_strings = on;
+ show escape_string_warning;
+  escape_string_warning 
+ -----------------------
+  on
+ (1 row)
+ 
+ show standard_conforming_strings;
+  standard_conforming_strings 
+ -----------------------------
+  on
+ (1 row)
+ 
+ select 'a\bcd' as f1, 'a\b''cd' as f2, 'a\b''''cd' as f3, 'abcd\'   as f4, 'ab\''cd' as f5, '\\' as f6;
+ WARNING:  nonstandard use of escape in a string literal at character 8
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of escape in a string literal at character 23
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of escape in a string literal at character 40
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of escape in a string literal at character 59
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of escape in a string literal at character 76
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of escape in a string literal at character 93
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+   f1   |   f2   |   f3    |  f4   |   f5   | f6 
+ -------+--------+---------+-------+--------+----
+  a\bcd | a\b'cd | a\b''cd | abcd\ | ab\'cd | \\
+ (1 row)
+ 
+ set standard_conforming_strings = off;
+ select 'a\\bcd' as f1, 'a\\b\'cd' as f2, 'a\\b\'''cd' as f3, 'abcd\\'   as f4, 'ab\\\'cd' as f5, '\\\\' as f6;
+ invalid command \
+ set escape_string_warning = off;
+ WARNING:  nonstandard use of \\ in a string literal at character 8
+ HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
+ WARNING:  nonstandard use of \\ in a string literal at character 24
+ HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
+ WARNING:  nonstandard use of \\ in a string literal at character 42
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of \\ in a string literal at character 62
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of \\ in a string literal at character 80
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+ WARNING:  nonstandard use of \\ in a string literal at character 98
+ HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
+   f1   |   f2   |   f3    |  f4   |   f5   | f6 
+ -------+--------+---------+-------+--------+----
+  a\bcd | a\b'cd | a\b''cd | abcd\ | ab\'cd | \\
+ (1 row)
+ 
+ set escape_string_warning = off;
+ set standard_conforming_strings = on;
+ select 'a\bcd' as f1, 'a\b''cd' as f2, 'a\b''''cd' as f3, 'abcd\'   as f4, 'ab\''cd' as f5, '\\' as f6;
+   f1   |   f2   |   f3    |  f4   |   f5   | f6 
+ -------+--------+---------+-------+--------+----
+  a\bcd | a\b'cd | a\b''cd | abcd\ | ab\'cd | \\
+ (1 row)
+ 
+ set standard_conforming_strings = off;
+ select 'a\\bcd' as f1, 'a\\b\'cd' as f2, 'a\\b\'''cd' as f3, 'abcd\\'   as f4, 'ab\\\'cd' as f5, '\\\\' as f6;
+   f1   |   f2   |   f3    |  f4   |   f5   | f6 
+ -------+--------+---------+-------+--------+----
+  a\bcd | a\b'cd | a\b''cd | abcd\ | ab\'cd | \\
+ (1 row)
+ 
Index: src/test/regress/sql/strings.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/strings.sql,v
retrieving revision 1.17
diff -c -r1.17 strings.sql
*** src/test/regress/sql/strings.sql	10 Jul 2005 04:54:33 -0000	1.17
--- src/test/regress/sql/strings.sql	26 Jan 2006 17:36:29 -0000
***************
*** 81,88 ****
  SELECT SUBSTRING('abcdefg' FROM 'b(.*)f') AS "cde";
  
  -- PostgreSQL extension to allow using back reference in replace string;
! SELECT regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})', '(\\1) \\2-\\3');
! SELECT regexp_replace('AAA   BBB   CCC   ', '\\s+', ' ', 'g');
  SELECT regexp_replace('AAA', '^|$', 'Z', 'g');
  SELECT regexp_replace('AAA aaa', 'A+', 'Z', 'gi');
  -- invalid option of REGEXP_REPLACE
--- 81,88 ----
  SELECT SUBSTRING('abcdefg' FROM 'b(.*)f') AS "cde";
  
  -- PostgreSQL extension to allow using back reference in replace string;
! SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
! SELECT regexp_replace('AAA   BBB   CCC   ', E'\\s+', ' ', 'g');
  SELECT regexp_replace('AAA', '^|$', 'Z', 'g');
  SELECT regexp_replace('AAA aaa', 'A+', 'Z', 'gi');
  -- invalid option of REGEXP_REPLACE
***************
*** 352,354 ****
--- 352,384 ----
  select md5('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'::bytea) = 'd174ab98d277d9f5a5611c2c9f419d9f' AS "TRUE";
  
  select md5('12345678901234567890123456789012345678901234567890123456789012345678901234567890'::bytea) = '57edf4a22be3c955ac49da2e2107b67a' AS "TRUE";
+ 
+ --
+ -- test behavior of escape_string_warning and standard_conforming_strings options
+ --
+ set escape_string_warning = off;
+ set standard_conforming_strings = off;
+ 
+ show escape_string_warning;
+ show standard_conforming_strings;
+ 
+ set escape_string_warning = on;
+ set standard_conforming_strings = on;
+ 
+ show escape_string_warning;
+ show standard_conforming_strings;
+ 
+ select 'a\bcd' as f1, 'a\b''cd' as f2, 'a\b''''cd' as f3, 'abcd\'   as f4, 'ab\''cd' as f5, '\\' as f6;
+ 
+ set standard_conforming_strings = off;
+ 
+ select 'a\\bcd' as f1, 'a\\b\'cd' as f2, 'a\\b\'''cd' as f3, 'abcd\\'   as f4, 'ab\\\'cd' as f5, '\\\\' as f6;
+ 
+ set escape_string_warning = off;
+ set standard_conforming_strings = on;
+ 
+ select 'a\bcd' as f1, 'a\b''cd' as f2, 'a\b''''cd' as f3, 'abcd\'   as f4, 'ab\''cd' as f5, '\\' as f6;
+ 
+ set standard_conforming_strings = off;
+ 
+ select 'a\\bcd' as f1, 'a\\b\'cd' as f2, 'a\\b\'''cd' as f3, 'abcd\\'   as f4, 'ab\\\'cd' as f5, '\\\\' as f6;
#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kevin Grittner (#9)
Re: Backslashes in string literals

Kevin Grittner wrote:

We found a bug in the code from my first patch. Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely. Attached is a patch
which should come close to implementing the TODO. In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For our
needs, we could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part. Alternatively, if you give me some direction, I might have time
to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

Sounds like you made great progress!

The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
psql. Take a look at psql's session_username(). It is called
every time the prompt is printed if the username is required. One great
feature of using pqSaveParameterStatus() is that it reads server packets
and keeps the tracked value updated for you without query overhead.
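
A rough sketch of what the psql side might look like, under some assumptions. As far as I know, pqSaveParameterStatus() is internal to libpq, and client code such as session_username() reads the tracked values through the public accessor PQparameterStatus(conn, paramName). The helper below, std_strings_from_status, is hypothetical and only interprets the string that accessor would return. It also assumes the server reports the setting at all, which requires GUC_REPORT on the variable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: turn the text of a reported
 * standard_conforming_strings value into the flag the psql scanner needs.
 * A NULL value means the server did not report the setting.
 */
bool
std_strings_from_status(const char *value)
{
	if (value == NULL)
		return false;		/* assume the historical backslash behavior */
	return strcmp(value, "on") == 0;
}
```

In psql the argument would presumably be PQparameterStatus(pset.db, "standard_conforming_strings"), evaluated before each scan so that a SET issued in the session is picked up.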

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#12Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#11)
Re: Backslashes in string literals

On Wed, Feb 1, 2006 at 10:50 am, in message
<200602011650.k11GoiU23147@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

Kevin Grittner wrote:

We found a bug in the code from my first patch. Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely. Attached is a patch
which should come close to implementing the TODO. In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For our
needs, we could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part. Alternatively, if you give me some direction, I might have time
to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

Sounds like you made great progress!

Thanks. It was actually pretty easy once I took the time to learn
flex. I'd kinda winged it in my emergency compile-time version. I'm
pretty sure that what I've done works; my biggest concern is over what
I've missed. For example, I was using pg_dump and pg_restore today and
realized that these, and other applications, likely need some kind of
work to support the feature.

The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
psql. Take a look at psql's session_username(). It is called
every time the prompt is printed if the username is required. One great
feature of using pqSaveParameterStatus() is that it reads server packets
and keeps the tracked value updated for you without query overhead.

I'll take a look at it. If I feel confident that I "get it", I'll do
the work and post another patch. Would you prefer that I resend the
whole works, or just the delta?

Also, since we're doing this out of need to fix the issue on our
production system, I'm compelled to work on the stable branch. Is it OK
to post patches from the tip of that branch, or should I really check
out the trunk (HEAD), too, and replicate it there for my patch posts?

-Kevin

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kevin Grittner (#12)
Re: Backslashes in string literals

Kevin Grittner wrote:

to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

Sounds like you made great progress!

Thanks. It was actually pretty easy once I took the time to learn
flex. I'd kinda winged it in my emergency compile-time version. I'm
pretty sure that what I've done works; my biggest concern is over what
I've missed. For example, I was using pg_dump and pg_restore today and
realized that these, and other applications, likely need some kind of
work to support the feature.

Right, I will look over the rest of the code and fix any places you
missed. I think most of it centers around ESCAPE_STRING_SYNTAX usage.
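
As an illustration of the kind of change client utilities may need, here is a minimal sketch of quoting a value so that it reads the same whether standard_conforming_strings is on or off. It is not PostgreSQL source: quote_literal_portable is a hypothetical name, the E prefix is assumed to be what ESCAPE_STRING_SYNTAX expands to, and multibyte encodings in which a backslash byte can appear inside a character are ignored.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL source: write src as an SQL string
 * literal that means the same under either setting.  Returns the number
 * of bytes written (excluding the terminator), or 0 if dst is too small.
 */
size_t
quote_literal_portable(const char *src, char *dst, size_t dst_size)
{
	int			has_backslash = (strchr(src, '\\') != NULL);
	size_t		n = 0;

	/* worst case: E prefix, two quotes, every byte doubled, terminator */
	if (dst_size < 2 * strlen(src) + 4)
		return 0;

	if (has_backslash)
		dst[n++] = 'E';		/* escape string syntax: backslashes always escape */
	dst[n++] = '\'';
	for (; *src; src++)
	{
		if (*src == '\'' || *src == '\\')
			dst[n++] = *src;	/* double quotes and backslashes */
		dst[n++] = *src;
	}
	dst[n++] = '\'';
	dst[n] = '\0';
	return n;
}
```

A value without backslashes comes out as a plain '...' literal, which both modes read identically. A value with backslashes comes out as E'...' with each backslash doubled, which is interpreted with escapes regardless of the setting.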

The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
psql. Take a look at psql's session_username(). It is called
every time the prompt is printed if the username is required. One great
feature of using pqSaveParameterStatus() is that it reads server packets
and keeps the tracked value updated for you without query overhead.

I'll take a look at it. If I feel confident that I "get it", I'll do
the work and post another patch. Would you prefer that I resend the
whole works, or just the delta?

I would like the whole patch rather than just an additional one. Again,
I will review it and polish whatever you don't do.

Also, since we're doing this out of need to fix the issue on our
production system, I'm compelled to work on the stable branch. Is it OK
to post patches from the tip of that branch, or should I really check
out the trunk (HEAD), too, and replicate it there for my patch posts?

The branch is fine. I will merge any changes into HEAD.
