73.5 and uw 713

Started by Noname about 22 years ago, 7 messages
#1Noname
ohp@pyrenet.fr

Hi all,

I upgraded my system from 7.3.4 to 7.3.5 yesterday and have already
experienced a crash during vacuum full.

I haven't recompiled with debug yet, but it's a SIGSEGV in function
repair_frag in vacuum.c.

Does it ring a bell?

Regards
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp@pyrenet.fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: 73.5 and uw 713

ohp@pyrenet.fr writes:

> I've upgraded my system from 7.3.4 to 7.3.5 yesterday and have already
> experienced a crash during vacuum full.
> I haven't recompiled with debug yet but it's a SIGSEGV in function
> repair_frag in vacuum.c

Considering that vacuum.c hasn't changed in that branch since 7.3beta4,
it's highly unlikely that this represents a regression between 7.3.4 and
7.3.5. Pre-existing bug, maybe ...

regards, tom lane

#3Noname
ohp@pyrenet.fr
In reply to: Tom Lane (#2)
Re: 73.5 and uw 713

Is there any way I can help with this debugging?
On Mon, 8 Dec 2003, Tom Lane wrote:

Date: Mon, 08 Dec 2003 14:03:42 -0500
From: Tom Lane <tgl@sss.pgh.pa.us>
To: ohp@pyrenet.fr
Cc: pgsql-hackers list <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] 73.5 and uw 713

> ohp@pyrenet.fr writes:
>
> > I've upgraded my system from 7.3.4 to 7.3.5 yesterday and have already
> > experienced a crash during vacuum full.
> > I haven't recompiled with debug yet but it's a SIGSEGV in function
> > repair_frag in vacuum.c
>
> Considering that vacuum.c hasn't changed in that branch since 7.3beta4,
> it's highly unlikely that this represents a regression between 7.3.4 and
> 7.3.5. Pre-existing bug, maybe ...
>
> regards, tom lane


#4Neil Conway
neilc@samurai.com
In reply to: Noname (#3)
Re: 73.5 and uw 713

ohp@pyrenet.fr writes:

> Is there any way I can help with this debugging?

Can you speculate on what might have caused the crash?

Is the crash reproducible?

When the backend crashed, it should have produced a core file
(assuming your system is configured to do so). Can you get a
stack trace from this core file (preferably after you've
recompiled PG with debugging symbols) and post it to the list?

-Neil

#5Noname
ohp@pyrenet.fr
In reply to: Neil Conway (#4)
Re: 73.5 and uw 713

Hi Neil and Tom
On Mon, 8 Dec 2003, Neil Conway wrote:

Date: Mon, 08 Dec 2003 22:44:42 -0500
From: Neil Conway <neilc@samurai.com>
To: ohp@pyrenet.fr
Cc: Tom Lane <tgl@sss.pgh.pa.us>,
pgsql-hackers list <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] 73.5 and uw 713

> ohp@pyrenet.fr writes:
>
> > Is there any way I can help with this debugging?
>
> Can you speculate on what might have caused the crash?

On second thought, this was compiled with the new SCO compiler,
the one Larry removed -Kno_host for. I don't know if it's related.

> Is the crash reproducible?

Yes. On certain databases it will ALWAYS crash.

> When the backend crashed, it should have produced a core file
> (assuming your system is configured to do so). Can you get a
> stack trace from this core file (preferably after you've
> recompiled PG with debugging symbols) and post it to the list?

That's the problem: I had another crash last night at 2:30 am (I run
vacuumdb -a on all databases at that time).
I decided to recompile everything with -debug turned on and couldn't
reproduce it any more. It reminds me of the crash I had with pg_dump last
summer...

If I can manage to get a good stack trace, I'll post it.

> -Neil


#6Noname
ohp@pyrenet.fr
In reply to: Noname (#1)
Re: 73.5 and uw 713

All right Tom,
I managed to get a trace:

Script started on Tue Dec 9 23:02:13 2003
# debug -c base/2308232/core.2509 39 /usr/local/pgsql/bin/postmaster
Avertissement: Fichier image mémoire tronqué
Erreur: Impossible de trouver le segment de mémoire associé à l'adresse
0xbfffd00c dans le processus p1
Image mémoire de postmaster (processus p1) créée
Erreur: Top stack frame invalid, program counter out of range
Avertissement: Stack adjusted to start with previous frame
FICHIER IMAGE MEMOIRE [AllocSetFree dans aset.c]
11 (segv code[SEGV_MAPERR] address[0x8426000]) SIGNALE dans p1
782: int fidx = AllocSetFreeIndex(chunk->size);
debug> stack
Suivi de pile correspondant à p1, Programme postmaster
*[0] AllocSetFree(context=0x2, pointer=0x80469e4, présumé: 0x80ea706) [aset.c@782]
[1]: ?() [0xbffae010]
script done on Tue Dec 9 23:06:05 2003

Not sure it helps...

Regards
On Mon, 8 Dec 2003, Tom Lane wrote:

Date: Mon, 08 Dec 2003 15:49:05 -0500
From: Tom Lane <tgl@sss.pgh.pa.us>
To: ohp@pyrenet.fr
Subject: Re: [HACKERS] 73.5 and uw 713

> > Is there any way I can help with this debugging?
>
> Well, for starters, how about that debug backtrace?
>
> regards, tom lane


#7Noname
ohp@pyrenet.fr
In reply to: Tom Lane (#2)
Re: 73.5 and uw 713

Hi Tom,

At last I have a much better trace for the vacuum full bug.

Can someone help me with this one?

Image mémoire de postmaster (processus p1) créée
FICHIER IMAGE MEMOIRE [swapn dans qsort.c]
11 (segv code[SEGV_MAPERR] address[0x8420000]) SIGNALE dans p1
0xbffae03f (swapn+47:) movl (%esi),%eax
debug> Suivi de pile correspondant à p1, Programme postmaster
*[0] swapn(0x2, 0x831b758, 0x831b770) [0xbffae03f]
[1]: qst(0x80448cc, 0x831b758, 0x831b788) [0xbffadca2]
[2]: qsort(0x831b758, 0x18, 0x2, 0x80eb9f8) [0xbffae17f]
[3]: repair_frag(vacrelstats=0x83122bc, onerel=0x82cf56c, vacuum_pages=0x8046a64, fraged_pages=0x8046a54, nindexes=1, Irel=0x83672e0) [vacuum.c@2227]
[4]: full_vacuum_rel(onerel=0x82cf56c, vacstmt=0x83104b4) [vacuum.c@955]
[5]: vacuum_rel(relid=16408, vacstmt=0x83104b4, expected_relkind=114 (or 'r')) [vacuum.c@827]
[6]: vacuum(vacstmt=0x83104b4) [vacuum.c@290]
[7]: ProcessUtility(parsetree=0x83104b4, dest=Remote, completionTag="") [utility.c@gram.y@713]
[8]: pg_exec_query_string(query_string=0x831020c, dest=Remote, parse_context=0x830e204) [postgres.c@gram.y@789]
[9]: PostgresMain(argc=4, argv=0x8046d78, username="ohp") [postgres.c@gram.y@2013]
[10]: DoBackend(port=0x829e500) [postmaster.c@2310]
[11]: BackendStartup(port=0x829e500) [postmaster.c@1932]
[12]: ServerLoop( présumé: 0x1, 0x8297af8, 0x1) [postmaster.c@1009]
[13]: PostmasterMain(argc=1, argv=0x8297af8) [postmaster.c@788]
[14]: main(argc=1, argv=0x8047c44, 0x8047c4c) [main.c@210]
[15]: _start() [0x806ad1c] debug>
debug>
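
For anyone reading frames [2] and [3] of the trace: the fault is inside the C library's qsort, called from repair_frag. As background only, here is a minimal sketch of the contract the standard qsort expects: a base pointer, an element count, an element size that matches the array, and a comparator that gives a consistent ordering. The LinkPair struct and both function names are hypothetical and are not taken from vacuum.c; this is not a diagnosis of the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type for illustration; NOT PostgreSQL's struct. */
typedef struct
{
    unsigned int new_id;    /* primary sort key */
    unsigned int this_id;   /* tie-breaker */
} LinkPair;

/*
 * Comparator with a consistent total order: returns a negative value,
 * zero, or a positive value, and never contradicts itself for the same
 * pair of elements.
 */
int
cmp_link_pair(const void *left, const void *right)
{
    const LinkPair *a = (const LinkPair *) left;
    const LinkPair *b = (const LinkPair *) right;

    if (a->new_id != b->new_id)
        return (a->new_id < b->new_id) ? -1 : 1;
    if (a->this_id != b->this_id)
        return (a->this_id < b->this_id) ? -1 : 1;
    return 0;
}

/*
 * Sort `count` elements in place. The count and sizeof(LinkPair) passed
 * to qsort must describe the array exactly.
 */
void
sort_link_pairs(LinkPair *items, size_t count)
{
    assert(items != NULL || count == 0);
    if (count > 1)
        qsort(items, count, sizeof(LinkPair), cmp_link_pair);
}
```

Calling sort_link_pairs(array, n) on an array of n LinkPair values orders it by new_id, then by this_id. Whether any part of this contract was violated in the crash above is not something the trace alone shows.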
