Fix for psql core dumping on bad user
The PostgreSQL interactive terminal will dump core on any script that is
run via the -f command line option if there exists a connect line without
a valid user. An example connect line is in one of the attached files.
The user that I have chosen is just testuser; you will see what I mean if
you just run that script on any database you have on your system, assuming
that you don't have a user called testuser. If you do, change the
username and see what happens. This bug has been in psql for the last
couple of versions that I have tested.
Also attached is a patch that seems to correct the issue. I admit that I
haven't studied the code long enough to determine if the fix is suitable,
but I feel that it will give you something to work from if it is not.
--
//===================================================================\\
|| D. Hageman <dhageman@dracken.com> ||
\\===================================================================//
Attachments:
psql-core.patch (text/plain; charset=US-ASCII)
diff -ruN postgresql-7.1/src/bin/psql/command.c postgresql-7.1.patched/src/bin/psql/command.c
--- postgresql-7.1/src/bin/psql/command.c Fri Mar 23 18:54:56 2001
+++ postgresql-7.1.patched/src/bin/psql/command.c Tue Apr 17 14:36:56 2001
@@ -1281,8 +1281,17 @@
              */
             psql_error("\\connect: %s", PQerrorMessage(pset.db));
             PQfinish(pset.db);
+
+            /*
+             * if at this point, we have an old connection, clean it up
+             * and exit cleanly. the script that they are running needs
+             * to be corrected.
+             */
             if (oldconn)
+            {
                 PQfinish(oldconn);
+                exit(EXIT_FAILURE);
+            }
             pset.db = NULL;
         }
     }
"D. Hageman" <dhageman@dracken.com> writes:
The PostgreSQL interactive terminal will dump core on any script that is
run via the -f command line option if there exists a connect line without
a valid user.
Curiously, I see no core dump here:
$ cat zscript
\connect - testuser
$ psql -f zscript regression
psql:zscript:1: \connect: FATAL 1: user "testuser" does not exist
$
Nonetheless, the comment at the top of do_connect() says that it
*should* terminate the program under these circumstances, so I'm not
sure why it doesn't. Peter?
regards, tom lane
Strange. Maybe I haven't fully explored the problem then. I would be
more than happy to supply a core file if you would like to analyze it. I
also guess that I should have been more complete in my bug report. I am
doing this on a RedHat 6.2 (fully updated, Intel architecture) machine and
I have seen this behavior in the past several versions of PostgreSQL, but
have only now gotten around to doing something about it. As for how I
compile it, I usually roll the RPMs from the SRPMs that you create, Tom?
I think I will go ahead and try it out on some other platforms later on
today ...
--
//===================================================================\\
|| D. Hageman <dhageman@dracken.com> ||
\\===================================================================//
D. Hageman writes:
Strange. Maybe I haven't fully explored the problem then. I would be
more than happy to supply a core file if you would like to analyze it.
Please compile with debug symbols, e.g.
src/bin/psql$ make clean
src/bin/psql$ make CFLAGS=-g all
then produce the core dump and run
gdb location/bin/psql some/where/core
and enter
bt
and show what it says.
--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
I just tried it on Alpha hardware running FreeBSD. Same results.
[~/opt/bin]
dhageman@marconi: ./psql -f test.sql test
psql:test.sql:1: \connect: FATAL 1: user "testuser" does not exist
Illegal instruction (core dumped)
At any rate, I am convinced that I am not going crazy here with the
results I saw on my normal database system.
--
//===================================================================\\
|| D. Hageman <dhageman@dracken.com> ||
\\===================================================================//
Tom Lane writes:
$ cat zscript
\connect - testuser
$ psql -f zscript regression
psql:zscript:1: \connect: FATAL 1: user "testuser" does not exist
$
Nonetheless, the comment at the top of do_connect() says that it
*should* terminate the program under these circumstances, so I'm not
sure why it doesn't. Peter?
The comment is not correct. Failure in do_connect() in non-interactive
mode terminates the script. In the case of -f the program terminates
implicitly, but in case of \i you would return to the prompt (or the
containing \i).
--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
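The termination rule described above can be pictured with a toy model. This is only a sketch and not psql's code: the types command and script and the function run_script are hypothetical names, and the choice to let the including script carry on after a nested failure is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy command kinds; hypothetical, not psql's internal representation. */
typedef enum { CMD_OK, CMD_CONNECT_FAIL, CMD_INCLUDE } cmd_kind;

typedef struct script script;

typedef struct
{
    cmd_kind      kind;
    const script *child;    /* used only by CMD_INCLUDE */
} command;

struct script
{
    const command *cmds;
    size_t         ncmds;
};

/*
 * Run a script non-interactively.  Returns 0 if every command ran, or 1
 * if a failed connect cut this script short.  *executed counts commands
 * started across all nesting levels.
 */
int run_script(const script *s, int *executed)
{
    assert(s != NULL && executed != NULL);

    for (size_t i = 0; i < s->ncmds; i++)
    {
        const command *c = &s->cmds[i];

        (*executed)++;
        switch (c->kind)
        {
            case CMD_OK:
                break;
            case CMD_CONNECT_FAIL:
                /* Abandon the rest of this script only. */
                return 1;
            case CMD_INCLUDE:
                /* Assumption: a nested failure returns control here
                 * and the including script continues. */
                (void) run_script(c->child, executed);
                break;
        }
    }
    return 0;
}
```

In this model a script started at top level (the -f case) has nothing left to run once run_script returns 1, so the program simply ends, while an included script hands control back to whatever included it.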
D. Hageman writes:
The PostgreSQL interactive terminal will dump core on any script that is
run via the -f command line option if there exists a connect line without
a valid user. An example connect line is in one of the attached files.
Okay, I've found the problem. When the connection fails, psql momentarily
runs without a valid database connection. When it does that, the
multibyte encoding has the invalid value -1. (You need to compile with
multibyte enabled to reproduce this.) With that value, PQmblen() has
trouble when it parses the next line. Perhaps PQmblen() should simply
return 1 when it is passed an invalid encoding. In any case it should do
better than dump core.
--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
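The suggestion that the length function return 1 for an invalid encoding can be sketched as follows. This is a minimal illustration and not libpq's implementation: the encoding ids, the dispatch table mblen_table, and the functions guarded_mblen and count_chars are all hypothetical, and the sketch simply assumes the length routine is selected by indexing a table with the encoding id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding ids for this sketch; not the values libpq uses. */
enum { ENC_INVALID = -1, ENC_ASCII = 0, ENC_UTF8 = 1, ENC_COUNT = 2 };

typedef int (*mblen_fn)(const unsigned char *s);

/* Every character is one byte. */
static int ascii_mblen(const unsigned char *s)
{
    (void) s;
    return 1;
}

/* Length of a UTF-8 sequence, judged from its lead byte. */
static int utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;   /* stray continuation byte: step over it */
}

static const mblen_fn mblen_table[ENC_COUNT] = { ascii_mblen, utf8_mblen };

/*
 * Guarded lookup: an out-of-range id such as -1 would index outside
 * mblen_table, so treat every byte as one character instead.
 */
int guarded_mblen(const unsigned char *s, int encoding)
{
    assert(s != NULL);
    if (encoding < 0 || encoding >= ENC_COUNT)
        return 1;
    return mblen_table[encoding](s);
}

/* Count the characters in a NUL-terminated line. */
size_t count_chars(const char *line, int encoding)
{
    const unsigned char *p = (const unsigned char *) line;
    size_t n = 0;

    while (*p != '\0')
    {
        int len = guarded_mblen(p, encoding);

        /* Stop at the terminator even if a sequence is truncated. */
        while (len-- > 0 && *p != '\0')
            p++;
        n++;
    }
    return n;
}
```

With the guard in place, scanning a line while the encoding is -1 degrades to byte-at-a-time parsing instead of reading outside the table.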
Will fix. Also, I will change the "invalid" encoding to a
default, i.e. SQL_ASCII, rather than -1.
--
Tatsuo Ishii
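The second half of that plan, falling back to a default encoding instead of -1, might look roughly like this. It is a sketch under stated assumptions and not the committed fix: the session struct, its functions, and the numeric encoding ids are hypothetical stand-ins for the client's real state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding ids; a real client library defines its own. */
enum { ENC_SQL_ASCII = 0, ENC_EUC_JP = 1 };

/* Hypothetical stand-in for the client's per-session state. */
typedef struct
{
    int connected;  /* nonzero while a connection is open */
    int encoding;   /* always a valid id, even when disconnected */
} session;

/* Start out disconnected, with the default encoding. */
void session_init(session *s)
{
    assert(s != NULL);
    s->connected = 0;
    s->encoding = ENC_SQL_ASCII;
}

/* Record a successful connection and the encoding it reported. */
void session_connected(session *s, int encoding)
{
    assert(s != NULL);
    s->connected = 1;
    s->encoding = encoding;
}

/*
 * Record a failed or closed connection.  The encoding falls back to
 * the default rather than to -1, so later parsing never sees an
 * invalid id.
 */
void session_connect_failed(session *s)
{
    assert(s != NULL);
    s->connected = 0;
    s->encoding = ENC_SQL_ASCII;
}
```

Keeping the field valid at all times means callers that parse input between connections need no special case for "no connection".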
Attached is the backtrace from gdb. I didn't find it very helpful when I
first looked into this problem, but maybe you can see something that I
missed.
I think tomorrow at work, I will take some time out to step through the
code and see exactly what is going on in this situation. I get the
impression from the responses that I received that my quick analysis is
wrong and the problem exists elsewhere ... at any rate, time for bed.
--
//===================================================================\\
|| D. Hageman <dhageman@dracken.com> ||
\\===================================================================//
I have committed a fix. Please grab the snapshot and try again.
--
Tatsuo Ishii
On Thu, 19 Apr 2001, Tatsuo Ishii wrote:
I have committed a fix. Please grab the snapshot and try again.
--
Tatsuo Ishii
The results are much much better. No core dumping at all. Thank you for
your help with this. Not that it was a major bug, but I like to help make
open source projects better whenever I can.
[dhageman@typhon psql]$ ./psql -f test.sql test
psql:test.sql:1: \connect: FATAL 1: user "testuser" does not exist
[dhageman@typhon psql]$ echo $?
2
[dhageman@typhon psql]$
--
//===================================================================\\
|| D. Hageman <dhageman@dracken.com> ||
\\===================================================================//