7.2.1 segfaults.

Started by Stephen Amadei · about 24 years ago · 7 messages · bugs

amadei@dandy.net

about 24 years ago

Hello.

I am new to the bugs list, so I hope this hasn't been covered before, but
I have a Slackware 8.0 system that is fairly up to date. Possibly to a
fault. Glibc is 2.2.5, gcc is 2.95.3, zlib is 1.1.4, kernel is 2.4.18
with the GRSecurity patch.

I have been trying to run Postgres 7.2.1 in a chrooted environment, but
once I try to connect to the server with a "psql -l", it segfaults as it
tries to read from the data/global/1262. I ran Postgres out of the chroot
and with a non-GRSecurity kernel, and it still segfaults.

I saw similar problems with a few other applications that use libz, since
I upgraded to 1.1.4 for security reasons. If I compile Postgres 7.1.3,
it runs fine... Any ideas?

----Steve
Stephen Amadei
Dandy.NET! CTO
Atlantic City, NJ

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Stephen Amadei (#1)

Re: 7.2.1 segfaults.

Stephen Amadei <amadei@dandy.net> writes:

I have been trying to run Postgres 7.2.1 in a chrooted environment, but
once I try to connect to the server with a "psql -l", it segfaults as it
tries to read from the data/global/1262.

Urgh. Can you provide a stack trace?

regards, tom lane

Stephen Amadei

amadei@dandy.net

about 24 years ago

In reply to: Tom Lane (#2)

Re: 7.2.1 segfaults.

On Fri, 3 May 2002, Tom Lane wrote:

Stephen Amadei <amadei@dandy.net> writes:

I have been trying to run Postgres 7.2.1 in a chrooted environment, but
once I try to connect to the server with a "psql -l", it segfaults as it
tries to read from the data/global/1262.

Urgh. Can you provide a stack trace?

You mean using strace? Yeah. The strace created quite a bit of logs, but
the process that segfaulted is included below.

If you need more, let me know.

----Steve
Stephen Amadei
Dandy.NET! CTO
Atlantic City, NJ

--------------------------------------------------

close(4)                                = 0
close(5)                                = 0
close(7)                                = 0
close(8)                                = 0
getpid()                                = 5217
rt_sigaction(SIGTERM, {0x810cfe8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGQUIT, {0x810cfe8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x810cfe8, [], 0x4000000}, {SIG_IGN}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE SEGV ALRM TERM CONT UNUSED], NULL, 8) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={60, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
recv(9, "\0\0\1(\0\2\0\0template1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 0) = 296
getpid()                                = 5217
send(9, "R\0\0\0\0", 5, 0)              = 5
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, {it_interval={0, 0}, it_value={60, 0}}) = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT UNUSED], NULL, 8) = 0
write(2, "DEBUG:  connection: host=209.128"..., 69) = 69
gettimeofday({1020378083, 363259}, {240, 0}) = 0
rt_sigaction(SIGHUP, {0x810d0a0, [], SA_RESTART|0x4000000}, {0x80f4564, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGINT, {0x810cff8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGTERM, {0x810cf78, [], SA_RESTART|0x4000000}, {0x810cfe8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGQUIT, {0x810cf44, [], SA_RESTART|0x4000000}, {0x810cfe8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x8108ef0, [], 0x4000000}, {0x810cfe8, [], 0x4000000}, 8) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGUSR1, {SIG_IGN}, {0x80f5480, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGUSR2, {0x80a8294, [], SA_RESTART|0x4000000}, {0x80f554c, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGFPE, {0x810d088, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGCHLD, {SIG_DFL}, {0x80f4808, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGTTIN, {SIG_DFL}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGTTOU, {SIG_DFL}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGCONT, {SIG_DFL}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGWINCH, {SIG_DFL}, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE SEGV CONT UNUSED], NULL, 8) = 0
fcntl(2, F_GETFD)                       = 0
brk(0x820a000)                          = 0x820a000
brk(0x820d000)                          = 0x820d000
brk(0x8214000)                          = 0x8214000
open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
--- SIGSEGV (Segmentation fault) ---

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Stephen Amadei (#3)

Re: 7.2.1 segfaults.

Stephen Amadei <amadei@dandy.net> writes:

Urgh. Can you provide a stack trace?

You mean using strace? Yeah. The strace created quite a bit of logs, but
the process that segfaulted is included below.

open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
--- SIGSEGV (Segmentation fault) ---

Hmm, this does not square with your prior statement that it's a chroot
can't-call-/bin/cp issue. Would you set things up to allow a core dump
(ie, not ulimit -c 0) and then do "gdb postgres-executable corefile"
followed by "bt"?

regards, tom lane
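
For readers unfamiliar with the request above: the shell setting "ulimit -c" corresponds to the RLIMIT_CORE resource limit, which child processes inherit. A minimal C sketch of raising that limit from inside a program follows. The helper names are hypothetical and are not part of PostgreSQL; the thread itself only asks for the shell-level change.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit, the programmatic counterpart of not running "ulimit -c 0".
 * Returns 0 on success, -1 on failure. */
static int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may always raise its soft limit up to
     * the hard limit, but not beyond it. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}

/* Hypothetical helper: report whether the soft limit now equals the
 * hard limit. Returns 1 if so, 0 if not, -1 on failure. */
static int core_soft_limit_is_max(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    return lim.rlim_cur == lim.rlim_max;
}
```

If the hard limit is itself 0, this still produces no core file. In practice it is usually simpler to raise the limit in the shell that starts postmaster, and then load the resulting core file into gdb as described in the message above.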

Stephen Amadei

amadei@dandy.net

about 24 years ago

In reply to: Tom Lane (#4)

Re: 7.2.1 segfaults.

On Fri, 3 May 2002, Tom Lane wrote:

Stephen Amadei <amadei@dandy.net> writes:

Urgh. Can you provide a stack trace?

You mean using strace? Yeah. The strace created quite a bit of logs, but
the process that segfaulted is included below.

open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
--- SIGSEGV (Segmentation fault) ---

Hmm, this does not square with your prior statement that it's a chroot
can't-call-/bin/cp issue.

It's not. I don't mean to confuse the two separate problems; that's why
I made two threads. In order to be sure that neither GRSecurity nor the
chroot was causing the segfault, I disabled these features and ran
postmaster as a normal user would... but I still connected via TCP/IP.

Would you set things up to allow a core dump
(ie, not ulimit -c 0) and then do "gdb postgres-executable corefile"
followed by "bt"?

Uh... sure. This will take a moment.

O.K... I think I have the info.

#0 0x255843 in strncpy (s1=0xbfffead0 "n\013", s2=0x8213414 "n\013",
n=4294967292) at ../sysdeps/generic/strncpy.c:82
#1 0x81516ab in GetRawDatabaseInfo ()
#2 0x81511fb in InitPostgres ()

I am not real familiar with gdb, so I only vaguely know what this shows,
besides stack. And in the above info, the 'n' in "n\013" actually has a
little '~' above it, but I figured that character might get managed by the
email.

----Steve
Stephen Amadei
Dandy.NET! CTO
Atlantic City, NJ

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Stephen Amadei (#5)

Re: 7.2.1 segfaults.

Stephen Amadei <amadei@dandy.net> writes:

#0 0x255843 in strncpy (s1=0xbfffead0 "n\013", s2=0x8213414 "n\013",
n=4294967292) at ../sysdeps/generic/strncpy.c:82
#1 0x81516ab in GetRawDatabaseInfo ()
#2 0x81511fb in InitPostgres ()

Hmm. It looks like GetRawDatabaseInfo is reading a zero for the VARSIZE
of datpath, and then computing -4 (which strncpy will take as a huge
unsigned value) as the string length to copy. You could try applying
a patch like this, in src/backend/utils/misc/database.c (about line
225 in current sources):

                /* Found it; extract the OID and the database path. */
                *db_id = tup.t_data->t_oid;
                pathlen = VARSIZE(&(tup_db->datpath)) - VARHDRSZ;
+               if (pathlen < 0)
+                   pathlen = 0;                /* pure paranoia */
                if (pathlen >= MAXPGPATH)
                    pathlen = MAXPGPATH - 1;    /* pure paranoia */
                strncpy(path, VARDATA(&(tup_db->datpath)), pathlen);
                path[pathlen] = '\0';

However this really shouldn't be needed; I'm wondering whether the
database's row in pg_database has been clobbered somehow. If so,
it probably won't get much further before dying.

Two questions: does the same thing happen for all available databases?
Have you tried to create a database with a nonstandard location
(nondefault datpath)?

regards, tom lane
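
The arithmetic described in this message can be sketched in isolation. The following is an illustration only, not the PostgreSQL source: SKETCH_VARHDRSZ and SKETCH_MAXPATH stand in for VARHDRSZ and MAXPGPATH (the value 1024 is an assumption), and the function names are hypothetical. It shows how a stored size of zero yields a length of -4, how that length appears once converted to a 32-bit unsigned value (the n=4294967292 in the backtrace), and how clamping both ends of the range keeps the copy bounded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VARHDRSZ 4     /* 4-byte length header, per the -4 in the thread */
#define SKETCH_MAXPATH  1024  /* assumed stand-in for MAXPGPATH */

/* Payload length as computed in the patch context: stored size minus header. */
static int payload_len(int32_t stored_size)
{
    return stored_size - SKETCH_VARHDRSZ;
}

/* What a 32-bit unsigned length parameter sees when handed a negative int. */
static uint32_t as_unsigned32(int len)
{
    return (uint32_t) len;
}

/* Clamp len to [0, SKETCH_MAXPATH - 1], copy, and NUL-terminate.
 * dst must have room for SKETCH_MAXPATH bytes. Returns the length used. */
static size_t copy_clamped(char *dst, const char *src, int len)
{
    assert(dst != NULL && src != NULL);

    if (len < 0)
        len = 0;                      /* lower bound added by the patch */
    if (len >= SKETCH_MAXPATH)
        len = SKETCH_MAXPATH - 1;     /* upper bound already present */

    strncpy(dst, src, (size_t) len);
    dst[len] = '\0';
    return (size_t) len;
}
```

With a stored size of 0, payload_len returns -4 and as_unsigned32 of that value is 4294967292, matching frame #0 of the backtrace. copy_clamped then treats the length as 0 and produces an empty string instead of running off the end of the buffer.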

Stephen Amadei

amadei@dandy.net

about 24 years ago

In reply to: Tom Lane (#6)

Re: 7.2.1 segfaults.

On Sat, 4 May 2002, Tom Lane wrote:

Hmm. It looks like GetRawDatabaseInfo is reading a zero for the VARSIZE
of datpath, and then computing -4 (which strncpy will take as a huge
unsigned value) as the string length to copy. You could try applying
a patch like this, in src/backend/utils/misc/database.c (about line
225 in current sources):

Weird.

However this really shouldn't be needed; I'm wondering whether the
database's row in pg_database has been clobbered somehow. If so,
it probably won't get much further before dying.

Good point. And deleting the $PGDATA directory and recreating it
fixed it without a patch.

Two questions: does the same thing happen for all available databases?
Have you tried to create a database with a nonstandard location
(nondefault datpath)?

No... I was creating the database in /usr/local/pgsql/data and then
'cp -aRp'ing it into the chroot. So I had two copies of the same corrupt
database.

Thanks for the help.

----Steve
Stephen Amadei
Dandy.NET! CTO
Atlantic City, NJ