A few questions about carriage returns (\r)
I have a database that needs to work with TEXT input generated from a
heterogeneous environment (UNIX, Win, Mac). As such, I have noticed
that equality comparisons of supposedly identical TEXT is failing due to
the different line terminators embedded in some TEXT fields.
So, as I understand it (please correct me if I'm wrong), UNIX uses a
"newline" (or \n), Mac uses "carriage return" (or \r) and Win/DOS uses
\r\n.
Looking into this issue has led me to a number of questions below...
Thanks for any help!
-Jon
=======================
1) Does anyone know why the "id" column is not visible for the final
select statement? I guess a lone \r literally means to go to the
farthest position to the left... but it seems like a bug that it moves
past its column position. Bug in psql?
test=> create table foo ( id serial, bar TEXT );
NOTICE: CREATE TABLE will create implicit sequence "foo_id_seq" for
serial column "foo.id"
CREATE TABLE
test=> insert into foo (bar) VALUES ('Line 1\nLine 2');
INSERT 0 1
test=> insert into foo (bar) VALUES ('Line 1\r\nLine 2');
INSERT 0 1
test=> insert into foo (bar) VALUES ('Line 1\rLine 2');
INSERT 0 1
test=> select id,textcol from foo;
id | textcol
----+---------
1 | Line 1
Line 2
2 | Line 1
Line 2
Line 2Line 1
(3 rows)
=======================
2) Is there a way to *view* the \n and \rs embedded in a TEXT field
using psql?
=======================
3) Is there a string function that is capable of replacing \r\n with \n?
More generally, is there a string function capable of regular
expression replace? (eg: perl and other languages have "=~
s/\r\n/\n/"). I imagine an SQL function that would work like this
fictional function:
SELECT re_replace('\r\n' IN bar USING '\n') FROM foo;
I see that "substring" works with REs, but I cannot for the life of me
figure out how to use substring equivalently to my fictional
re_replace() function above.
Extra information that may be useful:
test=> select version();
version
-------------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.1.4 on i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 4.1.0 20060304 (Red Hat 4.1.0-3)
(1 row)
--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham <lapham@jandr.org> Rio de Janeiro, Brasil
Personal: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------
On Thu, Jun 15, 2006 at 11:51:21AM -0300, Jon Lapham wrote:
So, as I understand it (please correct me if I'm wrong), UNIX uses a
"newline" (or \n), Mac uses "carriage return" (or \r) and Win/DOS uses
\r\n.
Correct.
1) Does anyone know why the "id" column is not visible for the final
select statement? I guess a lone \r literally means to go to the
farthest position to the left... but it seems like a bug that it moves
past its column position. Bug in psql?
Well, your terminal moving the cursor left when it sees a \r, psql
isn't doing anything (which is the problem).
CVS HEAD contains patches that display the output more clearly.
=======================
2) Is there a way to *view* the \n and \rs embedded in a TEXT field
using psql?
You could use replace to make them visible.
=======================
3) Is there a string function that is capable of replacing \r\n with \n?
Yes, replace.
# select replace('aac','a','b');
replace
---------
bbc
(1 row)
So replace(str, '\r', '\\r') should work
More generally, is there a string function capable of regular
expression replace? (eg: perl and other languages have "=~
s/\r\n/\n/"). I imagine an SQL function that would work like this
fictional function:
I beleive there is a regexp_replace. In psql, if you type \df you get a
list of all defined functions. The docs have info too.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout wrote:
# select replace('aac','a','b');
replace
---------
bbc
(1 row)So replace(str, '\r', '\\r') should work
Perfect, thanks.
I guess I'm blind, I didn't see this in the docs. :)
-Jon
--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham <lapham@jandr.org> Rio de Janeiro, Brasil
Personal: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------
Martijn van Oosterhout wrote:
I beleive there is a regexp_replace. In psql, if you type \df you get a
list of all defined functions. The docs have info too.
The function is confusingly not mentioned in the documentation under
"String Functions and Operators", but only under "Pattern Matching". I
would suggest at least adding a reference from the former to the latter.
If someone with access edits that page, perhaps they can at the same
time move the note "a. The to_ascii function supports conversion from
LATIN1, LATIN2, LATIN9, and WIN1250 encodings only" into the description
field for the function, or at least add an indication that there is a note.
/Nis
Nis Jorgensen wrote:
Martijn van Oosterhout wrote:
I beleive there is a regexp_replace. In psql, if you type \df you get a
list of all defined functions. The docs have info too.The function is confusingly not mentioned in the documentation under
"String Functions and Operators", but only under "Pattern Matching". I
would suggest at least adding a reference from the former to the latter.
Docs are updated in CVS HEAD and 8.1.X branches.
If someone with access edits that page, perhaps they can at the same
time move the note "a. The to_ascii function supports conversion from
LATIN1, LATIN2, LATIN9, and WIN1250 encodings only" into the description
field for the function, or at least add an indication that there is a note.
OK, footnote removed and text added to main description:
Convert string to ASCII from another encoding (only supports conversion
from LATIN1, LATIN2, LATIN9, and WIN1250 encodings)
--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +