Backslashes in string literals

Started by Kevin Grittnerover 20 years ago13 messageshackers
Jump to latest
#1Kevin Grittner
Kevin.Grittner@wicourts.gov

I've just been bitten by the "backslash in string literals" issue. I
have reviewed the mailing lists and the TODO list. I see that the
direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch. It's probably somewhat naive, but
maybe it can be helpful.

-Kevin

Attachments:

patch.txtapplication/octet-stream; name=patch.txtDownload+108-13
#2Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#1)
Re: Backslashes in string literals

I think we we will be turning on escape_string_warning in 8.2 and allow
standard_conforming_strings to be optionally turned on in that releaes.
I will keep the patch for us in completing that item.

This has been saved for the 8.2 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Kevin Grittner wrote:

I've just been bitten by the "backslash in string literals" issue. I
have reviewed the mailing lists and the TODO list. I see that the
direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch. It's probably somewhat naive, but
maybe it can be helpful.

-Kevin

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Peter Eisentraut
peter_e@gmx.net
In reply to: Kevin Grittner (#1)
Re: Backslashes in string literals

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#4Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#3)
Re: Backslashes in string literals

Peter Eisentraut wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.

There is already a run-time configuration option
standard_conforming_strings which does what you seem to have in mind.

Yes, but right now it is read-only. We didn't have time to allow it to
be set to true in this release. I think it has to wait for 8.2.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Peter Eisentraut (#3)
Re: Backslashes in string literals

On Fri, Dec 9, 2005 at 11:24 am, in message

<200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
<peter_e@gmx.net> wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a

patch

which supports this as a configure option, using a
-- enable- standard- strings switch.

There is already a run- time configuration option
standard_conforming_strings which does what you seem to have in

mind.

As Bruce has mentioned, this is currently read-only, set to off.

I needed something fast, and I could see a way to do it quickly with a
configure switch, to compile it for standard behavior. Since the
non-standard behavior is in the lexer, I couldn't see any reasonable way
to base it on a runtime switch. I'm curious what is intended here. Can
anyone give a one-paragraph explanation of how this configuration option
will work?

-Kevin

#6Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#5)
Re: Backslashes in string literals

Kevin Grittner wrote:

On Fri, Dec 9, 2005 at 11:24 am, in message

<200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
<peter_e@gmx.net> wrote:

Kevin Grittner wrote:

direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used. I've attached a

patch

which supports this as a configure option, using a
-- enable- standard- strings switch.

There is already a run- time configuration option
standard_conforming_strings which does what you seem to have in

mind.

As Bruce has mentioned, this is currently read-only, set to off.

I needed something fast, and I could see a way to do it quickly with a
configure switch, to compile it for standard behavior. Since the
non-standard behavior is in the lexer, I couldn't see any reasonable way
to base it on a runtime switch. I'm curious what is intended here. Can
anyone give a one-paragraph explanation of how this configuration option
will work?

Have you read our documentation?

http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-SYNTAX-CONSTANTS
http://www.postgresql.org/docs/8.1/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#6)
Re: Backslashes in string literals

On Sat, Dec 10, 2005 at 8:01 pm, in message

<200512110201.jBB21PE16562@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

Kevin Grittner wrote:

Since the
non- standard behavior is in the lexer, I couldn't see any

reasonable way

to base it on a runtime switch. I'm curious what is intended here.

Can

anyone give a one- paragraph explanation of how this configuration

option

will work?

Have you read our documentation?
http://www.postgresql.org/docs/8.1/static/sql- syntax.html#SQL-

SYNTAX- CONSTANTS

http://www.postgresql.org/docs/8.1/static/runtime- config-

compatible.html#RUNTI

ME- CONFIG- COMPATIBLE- VERSION

Yes.

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

Perhaps my language was ambiguous. I'm not curious about the intended
behavior from a user perspective, but what I might have missed in the
source code which would have allowed me to write my patch to better
comply with the documentation you cited. Since the problem is in the
lexer, the only way I could see to implement it as a run-time
configuration option, rather than a compile-time option, would be to
duplicate the lexer and maintain two sets of rules in parallel. I
generally try to avoid maintaining two parallel copies of code. I'm
curious whether I missed some other programming approach.

-Kevin

#8Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#7)
Re: Backslashes in string literals

Kevin Grittner wrote:

Between those and the release notes, I don't know what additional
information you want. In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

Perhaps my language was ambiguous. I'm not curious about the intended
behavior from a user perspective, but what I might have missed in the
source code which would have allowed me to write my patch to better
comply with the documentation you cited. Since the problem is in the
lexer, the only way I could see to implement it as a run-time
configuration option, rather than a compile-time option, would be to
duplicate the lexer and maintain two sets of rules in parallel. I
generally try to avoid maintaining two parallel copies of code. I'm
curious whether I missed some other programming approach.

Oh, that question. :-) We haven't looked yet at what it will require
to do this in the lexer, but I think what we will eventually do is to
add a more generalized filter to the lexer, and have the actions behave
differntly based on the boolean of whether we are using sql-standard
strings.

If you keep you eye on hackers or the committers messages you will see
when we commit the change for 8.2.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#2)
Re: Backslashes in string literals

We found a bug in the code from my first patch. Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely. Attached is a patch
which should come close to implementing the TODO. In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For our
needs, could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part. Alternatively, if you give me some direction, I might have time
to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

(2) There should probably be some tests added to exercise these
options.

(3) I took a quick shot at the documentation, but I'm sure I didn't
cover everything.

(4) I made the changes on REL8_1_STABLE, since we need it now. This
patch is relative to that branch, not the trunk.

I hope this is helpful. If there's anything I should have done
differently, please let me know, so I can try to do better next time.

-Kevin

On Fri, Dec 9, 2005 at 11:23 am, in message

<200512091723.jB9HNjB16785@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

I think we we will be turning on escape_string_warning in 8.2 and

allow

standard_conforming_strings to be optionally turned on in that

releaes.

Show quoted text

I will keep the patch for us in completing that item.

This has been saved for the 8.2 release:

http://momjian.postgresql.org/cgi- bin/pgpatches_hold

Attachments:

configurable-unified-patch.txtapplication/octet-stream; name=configurable-unified-patch.txtDownload+229-201
#10Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#9)
Re: Backslashes in string literals

On Wed, Jan 25, 2006 at 4:46 pm, in message

<43D7AB6B.EE98.0025.0@wicourts.gov>, "Kevin Grittner"
<Kevin.Grittner@wicourts.gov> wrote:

(2) There should probably be some tests added to exercise these
options.

Attached is a patch to address this one. Note that until psql is
fixed, this test will fail. I manually generated a portion of the text
to match what I expect to get once psql is fixed, so there could be
typos.

-Kevin

Attachments:

test-string-patch.txtapplication/octet-stream; name=test-string-patch.txtDownload+123-8
#11Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#9)
Re: Backslashes in string literals

Kevin Grittner wrote:

We found a bug in the code from my first patch. Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely. Attached is a patch
which should come close to implementing the TODO. In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For our
needs, could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part. Alternatively, if you give me some direction, I might have time
to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

Sounds like you made great progress!

The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
psql. Take a look for psql's session_username(). It is called
everytime the prompt is printed if the username is required. One great
feature of using pqSaveParameterStatus() is that it reads server packets
and keeps the tracked value updated for you without query overhead.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#12Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Bruce Momjian (#11)
Re: Backslashes in string literals

On Wed, Feb 1, 2006 at 10:50 am, in message

<200602011650.k11GoiU23147@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

Kevin Grittner wrote:

We found a bug in the code from my first patch. Since it was a low
frequency, non- destructive type of problem for us, I was able to

take my

time and look over the task a little more closely. Attached is a

patch

which should come close to implementing the TODO. In particular, it

is

now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time. There are some remaining

issues:

(1) I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner. For

our

needs, could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish

that

part. Alternatively, if you give me some direction, I might have

time

to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but

psql

isn't settled down.

Sounds like you made great progress!

Thanks. It was actually pretty easy once I took the time to learn
flex. I'd kinda winged it in my emergency compile-time version. I'm
pretty sure that what I've done works; my biggest concern is over what
I've missed. For example, I was using pg_dump and pg_restore today and
realized that these, and other applications, likely need some kind of
work to support the feature.

The proper way to do (1) is to call libpq's pqSaveParameterStatus()

from

psql. Take a look for psql's session_username(). It is called
everytime the prompt is printed if the username is required. One

great

feature of using pqSaveParameterStatus() is that it reads server

packets

and keeps the tracked value updated for you without query overhead.

I'll take a look at it. If I feel confident that I "get it", I'll do
the work and post another patch. Would you prefer that I resend the
whole works, or just the delta?

Also, since we're doing this out of need to fix the issue on our
production system, I'm compelled to work on the stable branch. Is it OK
to post patches from the tip of that branch, or should I really check
out the trunk (HEAD), too, and replicate it there for my patch posts?

-Kevin

#13Bruce Momjian
bruce@momjian.us
In reply to: Kevin Grittner (#12)
Re: Backslashes in string literals

Kevin Grittner wrote:

to generalize it. As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but

psql

isn't settled down.

Sounds like you made great progress!

Thanks. It was actually pretty easy once I took the time to learn
flex. I'd kinda winged it in my emergency compile-time version. I'm
pretty sure that what I've done works; my biggest concern is over what
I've missed. For example, I was using pg_dump and pg_restore today and
realized that these, and other applications, likely need some kind of
work to support the feature.

Right, I will look over the rest of the code and fix any places you
missed. I think most of it centers around ESCAPE_STRING_SYNTAX usage.

The proper way to do (1) is to call libpq's pqSaveParameterStatus()

from

psql. Take a look for psql's session_username(). It is called
everytime the prompt is printed if the username is required. One

great

feature of using pqSaveParameterStatus() is that it reads server

packets

and keeps the tracked value updated for you without query overhead.

I'll take a look at it. If I feel confident that I "get it", I'll do
the work and post another patch. Would you prefer that I resend the
whole works, or just the delta?

I would like the whole patch rather than just an additional one. Again,
I will review it and polish whatever you don't do.

Also, since we're doing this out of need to fix the issue on our
production system, I'm compelled to work on the stable branch. Is it OK
to post patches from the tip of that branch, or should I really check
out the trunk (HEAD), too, and replicate it there for my patch posts?

The branch is fine. I will merge any changes in to HEAD.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073