Weird failures on lorikeet

Started by Andres Freund, almost 8 years ago (5 messages)
#1 Andres Freund
andres@anarazel.de

Hi Andrew,

I noticed your animal lorikeet failed in the last two runs:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-21%2009%3A47%3A17
TRAP: FailedAssertion("!(((PageHeader) (page))->pd_special >= (__builtin_offsetof (PageHeaderData, pd_linp)))", File: "/home/andrew/bf64/root/HEAD/pgsql/src/include/storage/bufpage.h", Line: 313)

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-20%2012%3A46%3A17
TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)", File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/pmsignal.c", Line: 229)
2018-02-20 08:07:14.054 EST [5a8c1c3b.21d0:3] LOG: select() failed in postmaster: Bad address
2018-02-20 08:07:14.073 EST [5a8c1c3b.21d0:4] LOG: database system is shut down

The difference between the last successful and the last failing build
is a single comment typo commit.

It kinda looks like there might be some underlying issue on the machine
with shared memory going away or such?

Greetings,

Andres Freund

#2 Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Andres Freund (#1)
Re: Weird failures on lorikeet

On Thu, Feb 22, 2018 at 7:06 AM, Andres Freund <andres@anarazel.de> wrote:

Hi Andrew,

I noticed your animal lorikeet failed in the last two runs:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-21%2009%3A47%3A17
TRAP: FailedAssertion("!(((PageHeader) (page))->pd_special >= (__builtin_offsetof (PageHeaderData, pd_linp)))", File: "/home/andrew/bf64/root/HEAD/pgsql/src/include/storage/bufpage.h", Line: 313)

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-20%2012%3A46%3A17
TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)", File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/pmsignal.c", Line: 229)
2018-02-20 08:07:14.054 EST [5a8c1c3b.21d0:3] LOG: select() failed in postmaster: Bad address
2018-02-20 08:07:14.073 EST [5a8c1c3b.21d0:4] LOG: database system is shut down

The difference between the last successful and the last failing build
is a single comment typo commit.

It kinda looks like there might be some underlying issue on the machine
with shared memory going away or such?

Bad memory? Assorted output from recent runs:

+ ERROR: index "pg_toast_28546_index" contains corrupted page at block 1

TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)",
File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/pmsignal.c",
Line: 229)

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File:
"/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/postmaster/pgstat.c",
Line: 871)

--
Thomas Munro
http://www.enterprisedb.com

#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Thomas Munro (#2)
Re: Weird failures on lorikeet

Is there something going on with lorikeet again? I see this:

2020-06-25 01:55:13.380 EDT [5ef43c40.21e0:85] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhparent AND i.inhrelid = '21420'
AND c.relkind != 'p' AND c.relkind != 'I'
ORDER BY inhseqno;
*** starting debugger for pid 4456, tid 2644
2020-06-25 01:55:13.393 EDT [5ef43c40.21e0:86] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass, c.relkind, pg_catalog.pg_get_expr(c.relpartbound, c.oid)
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhrelid AND i.inhparent = '21420'
ORDER BY pg_catalog.pg_get_expr(c.relpartbound, c.oid) = 'DEFAULT', c.oid::pg_catalog.regclass::pg_catalog.text;
1 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
2020-06-25 01:55:13.455 EDT [5ef43c40.21e0:87] pg_regress/typed_table LOG: statement: CREATE TABLE persons3 OF person_type (
PRIMARY KEY (id),
name NOT NULL DEFAULT ''
);
*** continuing pid 4456 from debugger call (0)
*** starting debugger for pid 4456, tid 2644
48849 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
*** continuing pid 4456 from debugger call (0)
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:4] LOG: server process (PID 4456) was terminated by signal 11: Segmentation fault
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:5] DETAIL: Failed process was running: drop table some_tab cascade;

On 2018-Mar-06, Thomas Munro wrote:

On Thu, Feb 22, 2018 at 7:06 AM, Andres Freund <andres@anarazel.de> wrote:

I noticed your animal lorikeet failed in the last two runs:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-21%2009%3A47%3A17
TRAP: FailedAssertion("!(((PageHeader) (page))->pd_special >= (__builtin_offsetof (PageHeaderData, pd_linp)))", File: "/home/andrew/bf64/root/HEAD/pgsql/src/include/storage/bufpage.h", Line: 313)

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-02-20%2012%3A46%3A17
TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)", File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/pmsignal.c", Line: 229)
2018-02-20 08:07:14.054 EST [5a8c1c3b.21d0:3] LOG: select() failed in postmaster: Bad address
2018-02-20 08:07:14.073 EST [5a8c1c3b.21d0:4] LOG: database system is shut down

Bad memory? Assorted output from recent runs:

+ ERROR: index "pg_toast_28546_index" contains corrupted page at block 1

TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)",
File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/pmsignal.c",
Line: 229)

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File:
"/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/postmaster/pgstat.c",
Line: 871)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4 Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Alvaro Herrera (#3)
Re: Weird failures on lorikeet

On 6/25/20 12:52 PM, Alvaro Herrera wrote:

Is there something going on with lorikeet again? I see this:

2020-06-25 01:55:13.380 EDT [5ef43c40.21e0:85] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhparent AND i.inhrelid = '21420'
AND c.relkind != 'p' AND c.relkind != 'I'
ORDER BY inhseqno;
*** starting debugger for pid 4456, tid 2644
2020-06-25 01:55:13.393 EDT [5ef43c40.21e0:86] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass, c.relkind, pg_catalog.pg_get_expr(c.relpartbound, c.oid)
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhrelid AND i.inhparent = '21420'
ORDER BY pg_catalog.pg_get_expr(c.relpartbound, c.oid) = 'DEFAULT', c.oid::pg_catalog.regclass::pg_catalog.text;
1 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
2020-06-25 01:55:13.455 EDT [5ef43c40.21e0:87] pg_regress/typed_table LOG: statement: CREATE TABLE persons3 OF person_type (
PRIMARY KEY (id),
name NOT NULL DEFAULT ''
);
*** continuing pid 4456 from debugger call (0)
*** starting debugger for pid 4456, tid 2644
48849 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
*** continuing pid 4456 from debugger call (0)
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:4] LOG: server process (PID 4456) was terminated by signal 11: Segmentation fault
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:5] DETAIL: Failed process was running: drop table some_tab cascade;

I don't know what that's about. I'll reboot the machine presently and
see if it persists.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5 Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Andrew Dunstan (#4)
Re: Weird failures on lorikeet

On 6/25/20 2:42 PM, Andrew Dunstan wrote:

On 6/25/20 12:52 PM, Alvaro Herrera wrote:

Is there something going on with lorikeet again? I see this:

2020-06-25 01:55:13.380 EDT [5ef43c40.21e0:85] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhparent AND i.inhrelid = '21420'
AND c.relkind != 'p' AND c.relkind != 'I'
ORDER BY inhseqno;
*** starting debugger for pid 4456, tid 2644
2020-06-25 01:55:13.393 EDT [5ef43c40.21e0:86] pg_regress/typed_table LOG: statement: SELECT c.oid::pg_catalog.regclass, c.relkind, pg_catalog.pg_get_expr(c.relpartbound, c.oid)
FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i
WHERE c.oid = i.inhrelid AND i.inhparent = '21420'
ORDER BY pg_catalog.pg_get_expr(c.relpartbound, c.oid) = 'DEFAULT', c.oid::pg_catalog.regclass::pg_catalog.text;
1 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
2020-06-25 01:55:13.455 EDT [5ef43c40.21e0:87] pg_regress/typed_table LOG: statement: CREATE TABLE persons3 OF person_type (
PRIMARY KEY (id),
name NOT NULL DEFAULT ''
);
*** continuing pid 4456 from debugger call (0)
*** starting debugger for pid 4456, tid 2644
48849 [main] postgres 4456 try_to_debug: Failed to start debugger, Win32 error 2
*** continuing pid 4456 from debugger call (0)
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:4] LOG: server process (PID 4456) was terminated by signal 11: Segmentation fault
2020-06-25 01:55:18.181 EDT [5ef43bd6.2824:5] DETAIL: Failed process was running: drop table some_tab cascade;

I don't know what that's about. I'll reboot the machine presently and
see if it persists.

I've disabled the core dumper, because I've never got it working
properly anyway. But we might well still get the segfault.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services