bytea char escaping

Started by Ivar almost 23 years ago | 10 messages | general
#1Ivar
ivar@lumisoft.ee

Hi,

Which chars must be escaped in a bytea string, and how exactly?

LF -> \\n
CR -> \\r

#2Stephen Robert Norris
srn@commsecure.com.au
In reply to: Ivar (#1)
Re: bytea char escaping

On Tue, 2003-06-24 at 22:42, Ivar wrote:

Hi,

Which chars must be escaped in a bytea string, and how exactly?

LF -> \\n
CR -> \\r

From memory, you have to escape \ -> \\, ' -> \' and I think NULL to
\\000. I have a feeling that you need to escape all the non-printable
characters to \xxx escape codes, but I may be wrong there.

I've just looked at my code to do all this, and I'm ashamed to say I
can't really work out what it escapes. Must document better.

Stephen
--
Stephen Robert Norris <srn@commsecure.com.au>
CommSecure Australia Pty Ltd

#3Joe Conway
mail@joeconway.com
In reply to: Stephen Robert Norris (#2)
Re: bytea char escaping

Stephen Robert Norris wrote:

On Tue, 2003-06-24 at 22:42, Ivar wrote:

Which chars must be escaped in a bytea string, and how exactly?

From memory, you have to escape \ -> \\, ' -> \' and I think NULL to
\\000. I have a feeling that you need to escape all the non-printable
characters to \xxx escape codes, but I may be wrong there.

I've just looked at my code to do all this, and I'm ashamed to say I
can't really work out what it escapes. Must document better.

See:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=datatype-binary.html

HTH,

Joe

#4Stephen Robert Norris
srn@commsecure.com.au
In reply to: Joe Conway (#3)
Re: bytea char escaping

On Wed, 2003-06-25 at 12:24, Joe Conway wrote:

Stephen Robert Norris wrote:

On Tue, 2003-06-24 at 22:42, Ivar wrote:

Which chars must be escaped in a bytea string, and how exactly?

From memory, you have to escape \ -> \\, ' -> \' and I think NULL to
\\000. I have a feeling that you need to escape all the non-printable
characters to \xxx escape codes, but I may be wrong there.

I've just looked at my code to do all this, and I'm ashamed to say I
can't really work out what it escapes. Must document better.

See:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=datatype-binary.html

HTH,

Joe

Ah, yes. I remember that page. The funny thing about it is that it
doesn't actually say which strings _have_ to be escaped, just gives some
examples...

I always read the Table 5-8 example to indicate that 0-31, 127-255 and \
have to be escaped, but it's not stated anywhere...

Stephen

--
Stephen Robert Norris <srn@commsecure.com.au>
CommSecure Australia Pty Ltd

#5Joe Conway
mail@joeconway.com
In reply to: Stephen Robert Norris (#4)
Re: bytea char escaping

Stephen Robert Norris wrote:

Ah, yes. I remember that page. The funny thing about it is that it
doesn't actually say which strings _have_ to be escaped, just gives some
examples...

"When entering bytea values, octets of certain values must be escaped
(but all octet values may be escaped) when used as part of a string
literal in an SQL statement. In general, to escape an octet, it is
converted into the three-digit octal number equivalent of its decimal
octet value, and preceded by two backslashes. Some octet values have
alternate escape sequences, as shown in Table 5-7."

I guess it could be more clear, but this paragraph refers to values
which must be escaped on input, and table 5-7 shows them. So they are
the examples of the values that must be escaped ;-)
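The input rule quoted above can be sketched in a few lines of Python. This is only an illustration, not anything from the PostgreSQL sources: the function name is made up, and it assumes the server treats backslashes inside a quoted literal as escapes, as the 7.3 text describes. It escapes more than is strictly required (every non-printable or high-bit octet, plus the single quote and the backslash), which the quoted paragraph allows since all octet values may be escaped.

```python
def escape_bytea_for_literal(data: bytes) -> str:
    """Escape raw bytes for use inside a quoted SQL string literal.

    Hypothetical helper; not part of any PostgreSQL client library.
    """
    parts = []
    for octet in data:
        needs_escape = (
            octet < 0x20 or octet > 0x7E   # non-printable or high-bit octets
            or octet in (0x27, 0x5C)       # single quote, backslash
        )
        if needs_escape:
            # Two backslashes followed by the three-digit octal value.
            parts.append("\\\\%03o" % octet)
        else:
            parts.append(chr(octet))
    return "".join(parts)


literal = "'" + escape_bytea_for_literal(b"a\x00'\\\n") + "'"
print(literal)  # prints 'a\\000\\047\\134\\012'
```

Using the octal form for the quote and the backslash as well keeps the function to a single rule, at the cost of slightly longer literals than the alternate sequences from the table would give.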

I always read the Table 5-8 example to indicate that 0-31, 127-255 and \
have to be escaped, but it's not stated anywhere...

"Bytea output octets are also escaped. In general, each "non-printable"
octet decimal value is converted into its equivalent three digit octal
value, and preceded by one backslash. Most "printable" octets are
represented by their standard representation in the client character
set. The octet with decimal value 92 (backslash) has a special alternate
output representation. Details are in Table 5-8."

This one I think is pretty clear. It's discussing output, i.e. what the
client can expect to see coming from the server. So it's not saying you
need to escape those values, but it is saying that they will be sent to
you escaped.
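For the output direction, a client has to undo that escaping. A minimal Python sketch, assuming the value arrives as ASCII text in the form the quoted paragraph describes (a doubled backslash for octet 92, a backslash plus three octal digits for non-printable octets, everything else literal); the function name is hypothetical:

```python
def unescape_bytea_output(text: str) -> bytes:
    """Decode the escaped text form of a bytea value back into raw bytes.

    Hypothetical helper; assumes ASCII input in the escaped output format.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            # Printable octet sent as-is (raises ValueError if not a byte).
            out.append(ord(ch))
            i += 1
        elif text[i + 1:i + 2] == "\\":
            # Doubled backslash stands for a single backslash (octet 92).
            out.append(0x5C)
            i += 2
        else:
            digits = text[i + 1:i + 4]
            if len(digits) != 3:
                raise ValueError("truncated escape at offset %d" % i)
            value = int(digits, 8)  # raises ValueError on non-octal digits
            if value > 255:
                raise ValueError("octal escape out of range at offset %d" % i)
            out.append(value)
            i += 4
    return bytes(out)


print(unescape_bytea_output(r"a\000\\b\377"))  # prints b'a\x00\\b\xff'
```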

But improved wording for the docs is always a welcome patch!

Joe

#6Stephen Robert Norris
srn@commsecure.com.au
In reply to: Joe Conway (#5)
Re: bytea char escaping

On Wed, 2003-06-25 at 15:25, Joe Conway wrote:

Stephen Robert Norris wrote:

Ah, yes. I remember that page. The funny thing about it is that it
doesn't actually say which strings _have_ to be escaped, just gives some
examples...

"When entering bytea values, octets of certain values must be escaped
(but all octet values may be escaped) when used as part of a string
literal in an SQL statement. In general, to escape an octet, it is
converted into the three-digit octal number equivalent of its decimal
octet value, and preceded by two backslashes. Some octet values have
alternate escape sequences, as shown in Table 5-7."

I guess it could be more clear, but this paragraph refers to values
which must be escaped on input, and table 5-7 shows them. So they are
the examples of the values that must be escaped ;-)

Well, no. What it says is that certain values must be escaped (but
doesn't say which ones). Then it says there are alternate escape
sequences for some values, which it lists.

It doesn't say "The following table contains the characters which must
be escaped:", which would be much clearer (and actually useful).

Stephen
--
Stephen Robert Norris <srn@commsecure.com.au>
CommSecure Australia Pty Ltd

#7Joe Conway
mail@joeconway.com
In reply to: Stephen Robert Norris (#6)
Re: bytea char escaping

Stephen Robert Norris wrote:

Well, no. What it says is that certain values must be escaped (but
doesn't say which ones). Then it says there are alternate escape
sequences for some values, which it lists.

It doesn't say "The following table contains the characters which must
be escaped:", which would be much clearer (and actually useful).

Attached documentation patch updates the wording for bytea input
escaping, per complaint by Stephen Norris above.

Please apply.

Joe

Attachments:

datatype.sgml.diff (text/html; name=datatype.sgml.diff), +5 -5
#8Ivar
ivar@lumisoft.ee
In reply to: Ivar (#1)
Re: bytea char escaping

Yes, this is much clearer.

I got this part working, but some chars are handled wrong.
I'm using 7.3.1 on Windows.

E.g. alt 152 is handled wrong.

There are some encoding problems. Any ideas?

"Joe Conway" <mail@joeconway.com> wrote in message
news:3EF93DF4.3030507@joeconway.com...
Stephen Robert Norris wrote:

Well, no. What it says is that certain values must be escaped (but
doesn't say which ones). Then it says there are alternate escape
sequences for some values, which it lists.

It doesn't say "The following table contains the characters which must
be escaped:", which would be much clearer (and actually useful).

Attached documentation patch updates the wording for bytea input
escaping, per complaint by Stephen Norris above.

Please apply.

Joe

--------------------------------------------------------------------------------

Index: doc/src/sgml/datatype.sgml
===================================================================
RCS file: /opt/src/cvs/pgsql-server/doc/src/sgml/datatype.sgml,v
retrieving revision 1.119
diff -c -r1.119 datatype.sgml
*** doc/src/sgml/datatype.sgml	25 Jun 2003 03:50:52 -0000	1.119
--- doc/src/sgml/datatype.sgml	25 Jun 2003 06:19:28 -0000
***************
*** 1062,1069 ****
literal in an <acronym>SQL</acronym> statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
!     backslashes. Some octet values have alternate escape sequences, as
!     shown in <xref linkend="datatype-binary-sqlesc">.
</para>
<table id="datatype-binary-sqlesc">
--- 1062,1070 ----
literal in an <acronym>SQL</acronym> statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
!     backslashes. <xref linkend="datatype-binary-sqlesc"> contains the
!     characters which must be escaped, and gives the alternate escape
!     sequences where applicable.
</para>

<table id="datatype-binary-sqlesc">

--------------------------------------------------------------------------------

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

#9Ivar
ivar@lumisoft.ee
In reply to: Ivar (#1)
Re: bytea char escaping

If I encode the bytes as UTF-8 I get the right result.

I guess that the Unicode ODBC driver returns bytes as UTF-8. Is that so?
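A small Python sketch of why the encoding matters here. It assumes, purely for illustration, that "alt 152" means octet 152 in the Windows console code page CP437; the helper name is made up. The same character becomes different octets depending on the encoding, so the escaping has to be applied to the octets of whichever encoding the driver actually uses.

```python
def octets_by_encoding(text, encodings=("cp437", "latin-1", "utf-8")):
    """Return the octet values of `text` under each encoding (illustrative helper)."""
    return {enc: list(text.encode(enc)) for enc in encodings}


# Assumption: "alt 152" is octet 152 in code page 437.
char = bytes([152]).decode("cp437")
for enc, octets in octets_by_encoding(char).items():
    print(enc, octets)
# prints:
# cp437 [152]
# latin-1 [255]
# utf-8 [195, 191]
```

If the driver converts strings to UTF-8 before sending them, a single octet 152 on the client side would reach the server as two octets, which would match the behaviour described above.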

"Ivar" <ivar@lumisoft.ee> wrote in message
news:bdbhf9$v6n$1@main.gmane.org...

Yes, this is much clearer.

I got this part working, but some chars are handled wrong.
I'm using 7.3.1 on Windows.

E.g. alt 152 is handled wrong.

There are some encoding problems. Any ideas?

"Joe Conway" <mail@joeconway.com> wrote in message
news:3EF93DF4.3030507@joeconway.com...
Stephen Robert Norris wrote:

Well, no. What it says is that certain values must be escaped (but
doesn't say which ones). Then it says there are alternate escape
sequences for some values, which it lists.

It doesn't say "The following table contains the characters which must
be escaped:", which would be much clearer (and actually useful).

Attached documentation patch updates the wording for bytea input
escaping, per complaint by Stephen Norris above.

Please apply.

Joe



---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

#10Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#7)
Re: [PATCHES] bytea char escaping

Patch applied. Thanks.

---------------------------------------------------------------------------

Joe Conway wrote:

Stephen Robert Norris wrote:

Well, no. What it says is that certain values must be escaped (but
doesn't say which ones). Then it says there are alternate escape
sequences for some values, which it lists.

It doesn't say "The following table contains the characters which must
be escaped:", which would be much clearer (and actually useful).

Attached documentation patch updates the wording for bytea input
escaping, per complaint by Stephen Norris above.

Please apply.

Joe

[ text/html is unsupported, treating like TEXT/PLAIN ]

Index: doc/src/sgml/datatype.sgml
===================================================================
RCS file: /opt/src/cvs/pgsql-server/doc/src/sgml/datatype.sgml,v
retrieving revision 1.119
diff -c -r1.119 datatype.sgml
*** doc/src/sgml/datatype.sgml	25 Jun 2003 03:50:52 -0000	1.119
--- doc/src/sgml/datatype.sgml	25 Jun 2003 06:19:28 -0000
***************
*** 1062,1069 ****
literal in an <acronym>SQL</acronym> statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
!     backslashes. Some octet values have alternate escape sequences, as
!     shown in <xref linkend="datatype-binary-sqlesc">.
</para>
<table id="datatype-binary-sqlesc">
--- 1062,1070 ----
literal in an <acronym>SQL</acronym> statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
!     backslashes. <xref linkend="datatype-binary-sqlesc"> contains the
!     characters which must be escaped, and gives the alternate escape
!     sequences where applicable.
</para>

<table id="datatype-binary-sqlesc">


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073