New user: Windows, Postgresql, Python
Hi,
I'm just starting to look at Postgresql. My platform (for better or
worse) is Windows, and I'm quite interested in the pl/python support.
However, when I run the binary installer, it is not offered to me as
an option (it's there, but greyed out). The plpython.dll file is
installed, however.
When I check, it looks like plpython.dll is linked against Python
2.3. I have Python 2.4 installed on my PC, and I don't really want to
downgrade.
I suppose my first (lazy) question is, is there a Python 2.4
compatible plpython.dll available anywhere? Alternatively, is there a
way I can build one for myself? I'm happy enough doing my own build
(I have mingw and msys available), but I'd rather not build the whole
of postgresql if possible, just for the sake of one DLL....
Thanks in advance,
Paul.
--
"Bother," said the Borg, "We've assimilated Pooh."
Paul Moore <pf_moore@yahoo.co.uk> writes:
I suppose my first (lazy) question is, is there a Python 2.4
compatible plpython.dll available anywhere? Alternatively, is there a
way I can build one for myself? I'm happy enough doing my own build
(I have mingw and msys available), but I'd rather not build the whole
of postgresql if possible, just for the sake of one DLL....
Actually, I had a go and was surprised to find that the build was
pretty simple. Once I'd got a build, copying the plpython DLL (which
is built with a different name, AFAICT) over the one installed by the
binary installer seems to work fine. Is that OK, or could I hit
problems later?
Paul.
--
Progress isn't made by early risers. It's made by lazy men trying to
find easier ways to do something. -- Robert Heinlein
Hi,
I'm just starting to look at Postgresql. My platform (for better or
worse) is Windows, and I'm quite interested in the pl/python support.
However, when I run the binary installer, it is not offered
to me as an option (it's there, but greyed out). The
plpython.dll file is installed, however.
When I check, it looks like plpython.dll is linked against
Python 2.3. I have Python 2.4 installed on my PC, and I don't
really want to downgrade.
I suppose my first (lazy) question is, is there a Python 2.4
compatible plpython.dll available anywhere? Alternatively, is
there a way I can build one for myself? I'm happy enough
doing my own build (I have mingw and msys available), but I'd
rather not build the whole of postgresql if possible, just
for the sake of one DLL....
Not that I know of. IFF the libraries export the same entrypoints
without changing things, you could try just copying "python24.dll" to
"python23.dll". I don't know how the Python guys are with binary
compatibility, though. Might be worth a shot.
On a different note, can't you have both python 2.3 *and* 2.4 on the
same system? Considering they put the version number in the filename, it
seems this should be possible?
//Magnus
mha@sollentuna.net ("Magnus Hagander") writes:
I suppose my first (lazy) question is, is there a Python 2.4
compatible plpython.dll available anywhere? Alternatively, is
there a way I can build one for myself? I'm happy enough
doing my own build (I have mingw and msys available), but I'd
rather not build the whole of postgresql if possible, just
for the sake of one DLL....
Not that I know of. IFF the libraries export the same entrypoints
without changing things, you could try just copying "python24.dll" to
"python23.dll". I don't know how the Python guys are with binary
compatibility, though. Might be worth a shot.
As per my earlier posting, I actually found that building postgresql
wasn't at all hard. Once I'd built with Python 2.4 support, I had a
compatible plpython.dll I could just copy in.
I'm not sure renaming the Python DLL would have worked - it's
definitely frowned on...
On a different note, can't you have both python 2.3 *and* 2.4 on the
same system? Considering they put the version number in the filename, it
seems this should be possible?
I could, but I try to avoid this, as it involves double installs of
any extensions I want to use, or incompatible environments. More
laziness on my part, really :-)
Thanks for the suggestions,
Paul.
PS Thanks to the developers who made building postgresql on Windows
such an easy job! I was very impressed - I genuinely didn't think
that building such a large piece of software would be so hassle-free.
--
Never keep up with the Joneses. Drag them down to your level. --
Quentin Crisp
On Tue, Mar 15, 2005 at 07:05:22PM +0000, Paul Moore wrote:
As per my earlier posting, I actually found that building postgresql
wasn't at all hard. Once I'd built with Python 2.4 support, I had a
compatible plpython.dll I could just copy in.
Pardon the interruption, but do you have a PostgreSQL server with
PL/Python running on Windows? Have you been following the "plpython
function problem workaround" thread?
http://archives.postgresql.org/pgsql-general/2005-03/msg00599.php
We (the thread participants) could use somebody with a Windows
server to do some testing. Specifically, we're wondering if Python
on Windows requires embedded Python code to have CRLF (\r\n) as a
line ending, or if it requires (or at least permits) LF (\n) only.
If you're able to help, could you post the results of the
following?
CREATE FUNCTION pytest_lf() RETURNS integer AS
'x = 1\nreturn x\n'
LANGUAGE plpythonu;
CREATE FUNCTION pytest_crlf() RETURNS integer AS
'x = 1\r\nreturn x\r\n'
LANGUAGE plpythonu;
SELECT pytest_lf();
SELECT pytest_crlf();
With PostgreSQL 8.0.1, Python 2.4.1c1, and Solaris 9, I get this:
test=# SELECT pytest_lf();
pytest_lf
-----------
1
(1 row)
test=# SELECT pytest_crlf();
ERROR: plpython: could not compile function "pytest_crlf"
DETAIL: exceptions.SyntaxError: invalid syntax (line 2)
If you have the ability to compile standalone C programs with
embedded Python, we'd also be interested in seeing what happens if
you run the programs in the following messages:
http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php
http://archives.postgresql.org/pgsql-general/2005-03/msg00630.php
Any test results or comments you can provide would be appreciated.
Thanks.
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
mike@fuhr.org (Michael Fuhr) writes:
We (the thread participants) could use somebody with a Windows
server to do some testing.
Glad to help... This is with postgresql 8.0.1, Python 2.4.
Specifically, we're wondering if Python on Windows requires embedded
Python code to have CRLF (\r\n) as a line ending, or if it requires
(or at least permits) LF (\n) only. If you're able to help, could
you post the results of the following?
CREATE FUNCTION pytest_lf() RETURNS integer AS
'x = 1\nreturn x\n'
LANGUAGE plpythonu;
CREATE FUNCTION pytest_crlf() RETURNS integer AS
'x = 1\r\nreturn x\r\n'
LANGUAGE plpythonu;
SELECT pytest_lf();
SELECT pytest_crlf();
With PostgreSQL 8.0.1, Python 2.4.1c1, and Solaris 9, I get this:
test=# SELECT pytest_lf();
pytest_lf
-----------
1
(1 row)
test=# SELECT pytest_crlf();
ERROR: plpython: could not compile function "pytest_crlf"
DETAIL: exceptions.SyntaxError: invalid syntax (line 2)
I get exactly the same results.
If you have the ability to compile standalone C programs with
embedded Python, we'd also be interested in seeing what happens if
you run the programs in the following messages:
http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php
I get:
test1
What hath
Guido wrought?
http://archives.postgresql.org/pgsql-general/2005-03/msg00630.php
I get:
test2
Initialized.
Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)]
running:
print 1
print 2
1
2
end
running:
print 1
print 2
File "<string>", line 1
print 1
^
SyntaxError: invalid syntax
end
Finalized.
I don't know if this helps? It seems reasonable to me - as far as
Python C code is concerned, code strings should be \n-separated, just
like in Unix. The only place CRLF is applicable is in code read from
files, where the C runtime converts it to \n-delimited before the
Python APIs see it (as far as I understand it, which isn't very
far...)
The long and short of it is that I believe you just use \n to delimit
lines on Windows, just like anywhere else.
Regards,
Paul.
--
SCSI is not magic. There are fundamental technical reasons why it is
necessary to sacrifice a young goat to your SCSI chain now and then.
-- John Woods
On Tue, Mar 15, 2005 at 10:46:09PM +0000, Paul Moore wrote:
The long and short of it is that I believe you just use \n to delimit
lines on Windows, just like anywhere else.
Many thanks -- your test results contain the info we've been seeking.
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Michael Fuhr wrote:
On Tue, Mar 15, 2005 at 10:46:09PM +0000, Paul Moore wrote:
The long and short of it is that I believe you just use \n to delimit
lines on Windows, just like anywhere else.
Many thanks -- your test results contain the info we've been seeking.
Thanks a lot Paul.
Michael, you were right.
It seems python documentation is plain wrong, or I'm not able to
read it at all:
http://docs.python.org/ref/physical.html
"A physical line ends in whatever the current platform's convention is for
terminating lines. On Unix, this is the ASCII LF (linefeed) character. On
Windows, it is the ASCII sequence CR LF (return followed by linefeed). On
Macintosh, it is the ASCII CR (return) character."
This is the language _reference_ manual, btw. I'm very surprised to hear
python on windows is so broken.
Anyway, that makes life simpler for us. plpython programs are \n separated,
no matter what platform the server runs on. Client applications just need
to comply, which is what I suggested some time ago. I'm glad to hear
there's nothing to change on the server side.
.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it
On Wed, Mar 16, 2005 at 01:46:23PM +0100, Marco Colombo wrote:
It seems python documentation is plain wrong, or I'm not able to
read it at all:
http://docs.python.org/ref/physical.html
"A physical line ends in whatever the current platform's convention is for
terminating lines. On Unix, this is the ASCII LF (linefeed) character. On
Windows, it is the ASCII sequence CR LF (return followed by linefeed). On
Macintosh, it is the ASCII CR (return) character."
Perhaps the Python documentation could use some clarification about
when the platform's convention is required and when it isn't.
The "Embedding Python" documentation shows embedded code with lines
ending in \n and without saying anything about requiring the
platform's convention:
http://docs.python.org/ext/high-level-embedding.html
This is the language _reference_ manual, btw. I'm very surprised to hear
python on windows is so broken.
Anyway, that makes life simpler for us. plpython programs are \n separated,
no matter what platform the server runs on.
That the behavior makes life simpler is an argument against it being
broken (although it would be even less broken if it were more
flexible about what line endings it allows). A detailed response
would be getting off-topic for PostgreSQL, but I'll stand by what
I said earlier: I would find it bizarre if embedded Python code had
to use different line endings on different platforms. That would
mean the programmer couldn't simply do this:
PyRun_SimpleString("x = 1\n"
"print x\n");
Instead, the programmer would have to do a compile-time or run-time
check and build the string in a platform-dependent manner. What
problem would the language be solving by requiring that?
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On Wed, 16 Mar 2005, Michael Fuhr wrote:
On Wed, Mar 16, 2005 at 01:46:23PM +0100, Marco Colombo wrote:
It seems python documentation is plain wrong, or I'm not able to
read it at all:
http://docs.python.org/ref/physical.html
"A physical line ends in whatever the current platform's convention is for
terminating lines. On Unix, this is the ASCII LF (linefeed) character. On
Windows, it is the ASCII sequence CR LF (return followed by linefeed). On
Macintosh, it is the ASCII CR (return) character."
Perhaps the Python documentation could use some clarification about
when the platform's convention is required and when it isn't.
The "Embedding Python" documentation shows embedded code with lines
ending in \n and without saying anything about requiring the
platform's convention:
http://docs.python.org/ext/high-level-embedding.html
This is the language _reference_ manual, btw. I'm very surprised to hear
python on windows is so broken.
Anyway, that makes life simpler for us. plpython programs are \n separated,
no matter what platform the server runs on.
That the behavior makes life simpler is an argument against it being
broken (although it would be even less broken if it were more
flexible about what line endings it allows).
broken == 'not conforming to the specifications or the documentation'
The fact it helps us is just a side effect.
A detailed response
would be getting off-topic for PostgreSQL, but I'll stand by what
I said earlier: I would find it bizarre if embedded Python code had
to use different line endings on different platforms. That would
mean the programmer couldn't simply do this:
PyRun_SimpleString("x = 1\n"
"print x\n");
Instead, the programmer would have to do a compile-time or run-time
check and build the string in a platform-dependent manner. What
problem would the language be solving by requiring that?
This one:
aprogram = "x = 1\nprint x\n";
printf(aprogram);
PyRun_SimpleString(aprogram);
See? THIS program requires compile-time or run-time checks. You
can't run it on Windows, or Mac: it'll write garbage to the screen
(something that looks like garbage, that is).
Make it more general:
aprogram = get_program_from_somewhere();
PyRun_SimpleString(aprogram);
write_program_to_somefile_possibly_stdout(aprogram);
What if get_program_from_somewhere() reads user input? On Windows
lines will be \r\n separated. Now, should this program make
platform checks? Why not simply read a file (or stdin) in text
mode, and pass the result to PyRun_SimpleString()? The same applies
to output, of course.
Now something strikes me... in his tests, Paul tried my program and
the output looks identical to Linux. Now... I was expecting
program1 (the one with just \n) to display badly under Windows.
Am I missing something? Does C runtime support in Windows convert
\n into \r\n automatically in printf()? If so, I'm on the wrong track.
It may do the same with scanf() and other stdio functions.
I must say I wasn't expecting my program to run just fine, with all
those \n I used in it. Starting from
printf("> Initialized.\n");
Paul, can you please tell me which compiler you used under Windows
to compile my program, and whether you used any unusual compiler options? TIA.
.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it
On Wed, Mar 16, 2005 at 04:17:51PM +0100, Marco Colombo wrote:
aprogram = "x = 1\nprint x\n";
printf(aprogram);
PyRun_SimpleString(aprogram);
See? THIS program requires compile-time or run-time checks. You
can't run it on Windows, or Mac: it'll write garbage to the screen
(something that looks like garbage, that is).
Are you sure about that? It's been forever since I programmed in
a Microsoft environment, but as I recall, I/O streams opened in
"text mode" do automatic translations between \n and \r\n.
"Also, in text mode, carriage return-linefeed combinations are
translated into single linefeeds on input, and linefeed characters
are translated to carriage return-linefeed combinations on output."
I didn't look up Mac behavior but I'd be surprised if it didn't
offer the same "text mode" and "binary mode" behaviors. It's
annoying that these platforms use different line endings, but at
least their implementations of standard C libraries offer a way to
hide that difference from the programmer.
Now something strikes me... in his tests, Paul tried my program and
the output looks identical to Linux. Now... I was expecting
program1 (the one with just \n) to display badly under Windows.
Am I missing something? Does C runtime support in Windows convert
\n into \r\n automatically in printf()? If so, I'm on the wrong track.
It may do the same with scanf() and other stdio functions.
I think that's exactly what happens with I/O streams in "text mode."
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On Wed, 16 Mar 2005, Michael Fuhr wrote:
On Wed, Mar 16, 2005 at 04:17:51PM +0100, Marco Colombo wrote:
aprogram = "x = 1\nprint x\n";
printf(aprogram);
PyRun_SimpleString(aprogram);
See? THIS program requires compile-time or run-time checks. You
can't run it on Windows, or Mac: it'll write garbage to the screen
(something that looks like garbage, that is).
Are you sure about that? It's been forever since I programmed in
a Microsoft environment, but as I recall, I/O streams opened in
"text mode" do automatic translations between \n and \r\n.
No I wasn't sure and I actually was wrong. I've never programmed under
Windows. I've just learned something.
Apparently, as far as Python is concerned, the platform presents \n
at C level, so it makes sense for PyRun_SimpleString() to expect \n
as line terminator. Still I don't understand when the lexer would
use \r\n as physical line ending on Windows, but I can live with it. :-)
It seems that any client application under Windows is likely to use
only \n-delimited text, as long as it uses stdio functions and text
mode. Problems arise when it gets text from some other source. But since
at C level text is expected to be \n-delimited, the application should
take care of the conversion as soon as it receives the data.
I think that if we want to be conservative, any input that is supposed
to be treated (actively) as text by the server, should be \n-delimited.
That includes any function source.
I'm against any on-the-fly conversion, now.
I don't like the idea of PostgreSQL accepting input in one form
(\r\n) and providing output in a different form (\n). Also think of
a function definition with mixed \r\n and \n lines: we'd have no way
to reconstruct the original input. I think we should just state that
text used for function definitions is \n-delimited. Some languages may
accept \r\n as well, but that's an undocumented side effect, and bad practice.
Now that I learned that C programs on Windows are expected to handle
\n-delimited text, I can't think of any reason why an application should
send \r\n-delimited text via libpq as a function definition, unless
the programmer forgot to perform the "standard" \r\n to \n conversion
somewhere.
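A minimal sketch of the client-side conversion described above, written in
modern Python rather than the Python 2.4 used in this thread. The helper name
normalize_newlines is hypothetical and not part of libpq or PL/Python. It
rewrites CRLF and lone CR line endings as LF before the text is sent as a
function definition. Note that it also rewrites carriage returns that appear
inside string literals, which is the loss of original input this message
warns about.

```python
def normalize_newlines(source: str) -> str:
    """Return source with CRLF and lone CR line endings rewritten as LF."""
    # Replace CRLF first so that a CRLF pair does not turn into two LFs.
    return source.replace("\r\n", "\n").replace("\r", "\n")


# Hypothetical function body as a Windows editor might produce it.
crlf_body = "x = 1\r\nreturn x\r\n"
lf_body = normalize_newlines(crlf_body)
print(repr(lf_body))  # 'x = 1\nreturn x\n'
```

A client would apply this once, as soon as it receives the text, and before
building the CREATE FUNCTION statement.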
.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it
[I've changed the Subject back to the thread that started this
discussion.]
On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote:
I'm against any on-the-fly conversion, now.
I don't like the idea of PostgreSQL accepting input in one form
(\r\n) and providing output in a different form (\n). Also think of
a function definition with mixed \r\n and \n lines: we'd have no way
to reconstruct the original input.
Yeah, that's a reasonable argument against modifying the function
source code before storing it in pg_proc. But I expect this problem
will come up again, and some people might not care about being able
to reconstruct the original input if it's just a matter of stripped
carriage returns, especially if the function logic doesn't use
literal carriage return characters that would be missed. For those
people, the validator hack might be an acceptable way to deal with
a client interface that inserts carriage returns that the programmer
didn't intend anyway. Not necessarily as part of the core PostgreSQL
code or even distributed with PostgreSQL, but as something they
could install if they wanted to.
I think we should just state that text used for function definitions
is \n-delimited. Some languages may accept \r\n as well, but that's
an undocumented side effect, and bad practice.
Whether it's an "undocumented side effect" depends on the language,
and whether it's bad practice is a matter of opinion. In any case,
that's the language's concern and not something PostgreSQL should
judge or enforce. PostgreSQL shouldn't have to know or care about a
procedural language's syntax -- a function's source code should be an
opaque object that PostgreSQL stores and passes to the language's
handler without caring about its contents. Syntax enforcement should
be in the language's validator or handler according to the language's
own rules.
Speaking of code munging and syntax enforcement, have a look at this:
CREATE FUNCTION foo() RETURNS text AS $$
return """line 1
line 2
line 3
"""
$$ LANGUAGE plpythonu;
SELECT foo();
foo
--------------------------
line 1
        line 2
        line 3
(1 row)
Eh? Where'd those leading tabs come from? Why, they came from
PLy_procedure_munge_source() in src/pl/plpython/plpython.c:
mrc = PLy_malloc(mlen);
plen = snprintf(mrc, mlen, "def %s():\n\t", name);
Assert(plen >= 0 && plen < mlen);
sp = src;
mp = mrc + plen;
while (*sp != '\0')
{
    if (*sp == '\n')
    {
        *mp++ = *sp++;
        *mp++ = '\t';
    }
    else
        *mp++ = *sp++;
}
*mp++ = '\n';
*mp++ = '\n';
*mp = '\0';
How about them apples? The PL/Python handler is already doing some
fixup behind the scenes (and potentially causing problems, as the
example illustrates).
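The effect of that loop can be reproduced outside the server. The following
is a rough Python re-creation of the C logic quoted above, offered as a
sketch only: munge_source is a hypothetical name, and the code is modern
Python rather than the Python 2.4 discussed here. It prefixes the body with a
def line and puts a tab after every LF, including the LFs inside the
triple-quoted string.

```python
def munge_source(name: str, src: str) -> str:
    """Wrap src in a def and insert a tab after every LF, like the C loop."""
    return "def %s():\n\t" % name + src.replace("\n", "\n\t") + "\n\n"


body = 'return """line 1\nline 2\nline 3\n"""'
namespace = {}
# Define the wrapped function, then call it to see what the string became.
exec(munge_source("foo", body), namespace)
print(repr(namespace["foo"]()))  # 'line 1\n\tline 2\n\tline 3\n\t'
```

The returned string carries a tab after each embedded newline, which matches
the unexpected indentation in the query result.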
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On Wed, 16 Mar 2005, Michael Fuhr wrote:
[I've changed the Subject back to the thread that started this
discussion.]
On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote:
I'm against any on-the-fly conversion, now.
I don't like the idea of PostgreSQL accepting input in one form
(\r\n) and providing output in a different form (\n). Also think of
a function definition with mixed \r\n and \n lines: we'd have no way
to reconstruct the original input.
Yeah, that's a reasonable argument against modifying the function
source code before storing it in pg_proc. But I expect this problem
will come up again, and some people might not care about being able
to reconstruct the original input if it's just a matter of stripped
carriage returns, especially if the function logic doesn't use
literal carriage return characters that would be missed. For those
people, the validator hack might be an acceptable way to deal with
a client interface that inserts carriage returns that the programmer
didn't intend anyway. Not necessarily as part of the core PostgreSQL
code or even distributed with PostgreSQL, but as something they
could install if they wanted to.
Agreed.
I think we should just state that text used for function definitions
is \n-delimited. Some languages may accept \r\n as well, but that's
an undocumented side effect, and bad practice.
Whether it's an "undocumented side effect" depends on the language,
and whether it's bad practice is a matter of opinion.
Sure. I mean, we may just state that, per spec. Program data
should be \n-delimited, full stop. It sounds sensible to me.
Just put it somewhere in the docs, problem solved. We're losing
nothing. I'm just proposing to add that to the docs/specs.
In any case,
that's the language's concern and not something PostgreSQL should
judge or enforce. PostgreSQL shouldn't have to know or care about a
procedural language's syntax -- a function's source code should be an
opaque object that PostgreSQL stores and passes to the language's
handler without caring about its contents. Syntax enforcement should
be in the language's validator or handler according to the language's
own rules.
That's what we do now. My point being it's not our job to "fix" data
coming from the client. If a client app creates a plpython function
the wrong way, fix it. Why should we place a paperbag on a client bug?
Speaking of code munging and syntax enforcement, have a look at this:
CREATE FUNCTION foo() RETURNS text AS $$
return """line 1
line 2
line 3
"""
$$ LANGUAGE plpythonu;
SELECT foo();
foo
--------------------------
line 1
        line 2
        line 3
(1 row)
Eh? Where'd those leading tabs come from? Why, they came from
PLy_procedure_munge_source() in src/pl/plpython/plpython.c:
mrc = PLy_malloc(mlen);
plen = snprintf(mrc, mlen, "def %s():\n\t", name);
Assert(plen >= 0 && plen < mlen);
sp = src;
mp = mrc + plen;
while (*sp != '\0')
{
    if (*sp == '\n')
    {
        *mp++ = *sp++;
        *mp++ = '\t';
    }
    else
        *mp++ = *sp++;
}
*mp++ = '\n';
*mp++ = '\n';
*mp = '\0';
How about them apples? The PL/Python handler is already doing some
fixup behind the scenes (and potentially causing problems, as the
example illustrates).
OMG! It's indenting the function body. I think you can't do that
w/o being syntax-aware. I'm not familiar with the code, why is it
adding a 'def' in front of it at all? I understand that once you do
it you'll have to shift the code by an indentation level.
.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it
On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote:
OMG! It's indenting the function body. I think you can't do that
w/o being syntax-aware. I'm not familiar with the code, why is it
adding a 'def' in front of it at all? I understand that once you do
it you'll have to shift the code by an indentation level.
Presumably because it wants to create a function, which can later be
called. Since python is sensitive to whitespace it has to indent the
code to make it work.
There was an example on the web somewhere (the link has been posted to
this list) of a piece of python which you can load into the interpreter
which will allow it to accept \r\n terminated lines. I don't recall if
anyone actually tried it out or not...
Won't fix the indenting problem though...
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes:
On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote:
OMG! It's indenting the function body. I think you can't do that
w/o being syntax-aware. I'm not familiar with the code, why is it
adding a 'def' in front of it at all? I understand that once you do
it you'll have to shift the code by an indentation level.
Presumably because it wants to create a function, which can later be
called. Since python is sensitive to whitespace it has to indent the
code to make it work.
Seems like we have to upgrade that thing to have a complete
understanding of Python lexical rules --- at least enough to know where
the line boundaries are. Which is pretty much exactly the same as
knowing which CRs to strip out. So I guess we have a candidate place
for a solution.
Anyone want to code it up? I don't know enough Python to do it ...
regards, tom lane
pgsql@esiway.net (Marco Colombo) writes:
No I wasn't sure and I actually was wrong. I've never programmed under
Windows. I've just learned something.
Indeed, the Windows C runtime translates CRLF to \n on input, and \n
to CRLF on output, for files in "text" mode. Unix programmers tend
not to be aware of the distinction between text and binary modes
(it's actually in standard C) as it makes no difference on Unix. But
it does on Windows (and possibly other platforms).
<offtopic>
Ironically, at the lowest level, Windows behaves just like Unix (files
are pure byte streams) - it's only in the C runtime and application
code that CRLF issues arise, and that's a backward-compatibility hack
dating back to the days of MS-DOS.
</offtopic>
Apparently, as far as Python is concerned, the platform presents \n
at C level, so it makes sense for PyRun_SimpleString() to expect \n
as line terminator. Still I don't understand when the lexer would
use \r\n as physical line ending on Windows, but I can live with it. :-)
Internally, Python uses C string semantics, where \n represents a
newline. Recent versions of Python have "universal newline" support,
which in the broadest sense attempts to be forgiving over line
endings, and treat LF, CRLF, and even bare CR, as line endings. I
don't know exactly where it applies, though, so I believe the most
sensible approach is to always use \n (LF) in strings passed to
Python APIs. This is essentially the "be conservative in what you
send" philosophy.
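One place where newline translation is documented is file reading. The sketch
below uses modern Python's built-in open(), whose newline parameter did not
exist in this form in Python 2.4, so treat it as an illustration of the idea
rather than of the interpreter discussed here. The helper name read_both_ways
is hypothetical. It writes CRLF-terminated bytes to a temporary file, then
reads them back once with the default universal-newline translation and once
with translation switched off.

```python
import os
import tempfile


def read_both_ways(raw: bytes):
    """Write raw to a temp file; return (translated, untranslated) text."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Default newline=None: CRLF and bare CR are read back as LF.
        with open(path, "r", encoding="ascii") as f:
            translated = f.read()
        # newline="": line endings are passed through untouched.
        with open(path, "r", encoding="ascii", newline="") as f:
            untranslated = f.read()
    finally:
        os.remove(path)
    return translated, untranslated


translated, untranslated = read_both_ways(b"x = 1\r\nprint(x)\r\n")
print(repr(translated))    # 'x = 1\nprint(x)\n'
print(repr(untranslated))  # 'x = 1\r\nprint(x)\r\n'
```

Text that has gone through the translating read is already LF-only, which is
the form recommended above for strings passed to the Python APIs.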
Paul.
--
A little inaccuracy sometimes saves tons of explanation -- Saki
On Thursday 17 March 2005 23:17, Paul Moore wrote:
<offtopic>
Ironically, at the lowest level, Windows behaves just like Unix
(files are pure byte streams) - it's only in the C runtime and
application code that CRLF issues arise, and that's a
backward-compatibility hack dating back to the days of MS-DOS.
</offtopic>
Even more offtopic:
Actually, the CR/LF pair dates back to the ancient teletype writers,
which needed one character for the right-to-left movement of the paper
carriage (hence the literal meaning of "Carriage Return"), and one for
the vertical movement.
I believe it was Tom Swan who, in his "Programming Turbo Pascal" from
the eighties, said something to the effect that "this is not only a
case of the tail wagging the dog, but a tail that keeps on wagging
twenty years after the dog has rolled over and died."
Sorry-for-spinning-off-on-a-tangent-ly yours -
--
Leif Biberg Kristensen
http://solumslekt.org/
On Thu, Mar 17, 2005 at 10:49:24AM -0500, Tom Lane wrote:
Seems like we have to upgrade that thing to have a complete
understanding of Python lexical rules --- at least enough to know where
the line boundaries are. Which is pretty much exactly the same as
knowing which CRs to strip out. So I guess we have a candidate place
for a solution.
Anyone want to code it up? I don't know enough Python to do it ...
[Sound of crickets]
More pabulum for pondering:
% cat -v foo.py
print '''line 1^M
line^M2^M
line 3^M
'''^M
% python foo.py | cat -v
line 1
line
2
line 3
% cat -v bar.py
print 'line 1^M'
% python bar.py
File "bar.py", line 1
print 'line 1
^
SyntaxError: EOL while scanning single-quoted string
Line-ending CRs stripped, even inside quotes; mid-line CRs converted
to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder
what Python on Windows would do. If it behaves the same way, then
a munging algorithm might be CRLF => LF, otherwise CR => LF. Or
we could take Marco's suggestion and do nothing, putting the burden
on the client to send the right thing.
That doesn't address the indentation munging, though. That appears
to be a matter of knowing whether you're inside a quote or not when
a LF appears.
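One way to know whether a LF falls inside a quote is to let Python's own
tokenizer find the string literals. The sketch below is an assumption-laden
illustration in modern Python using the standard tokenize module; it is not
the PL/Python handler's code, and indent_body and wrap_as_function are
hypothetical names. It indents every physical line except those that begin
inside a multi-line string literal. Multi-line f-strings, which newer Python
versions tokenize differently, are not handled.

```python
import io
import tokenize


def indent_body(src: str, indent: str = "\t") -> str:
    """Indent each physical line unless it starts inside a string literal."""
    protected = set()  # 1-based numbers of lines that begin inside a string
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type == tokenize.STRING and tok.end[0] > tok.start[0]:
            protected.update(range(tok.start[0] + 1, tok.end[0] + 1))
    out = []
    for lineno, line in enumerate(src.split("\n"), start=1):
        out.append(line if lineno in protected else indent + line)
    return "\n".join(out)


def wrap_as_function(name: str, src: str) -> str:
    """Build a def whose body is src, leaving string contents untouched."""
    return "def %s():\n%s\n" % (name, indent_body(src))


body = 'return """line 1\nline 2\nline 3\n"""'
namespace = {}
exec(wrap_as_function("foo", body), namespace)
print(repr(namespace["foo"]()))  # 'line 1\nline 2\nline 3\n'
```

With this approach the string comes back without added tabs. The same token
positions could in principle tell a CR-stripping pass which carriage returns
sit inside literals, though that is not shown here.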
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Michael Fuhr <mike@fuhr.org> writes:
Line-ending CRs stripped, even inside quotes; mid-line CRs converted
to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder
what Python on Windows would do.
Unfortunately, I don't think that proves anything, because according
to earlier discussion Python will do newline-munging when it reads
a file (including a script file). The question that we have to deal
with is what are the rules for a string fed to PyRun_String ... and it
seems those rules are not the same.
regards, tom lane