BF assertion failure on mandrill in walsender, v13

Started by Thomas Munro over 4 years ago, 4 messages
#1 Thomas Munro
thomas.munro@gmail.com

Hi,

Not sure if there is much chance of debugging this one-off failure
without a backtrace (long shot: any chance there's still a core
file?), but for the record: mandrill choked on a null pointer passed
to GetMemoryChunkContext() inside a walsender running logical
replication. Possibly via pfree(NULL), but there are other paths.
That's an animal running with force_parallel_mode and
RANDOMIZE_ALLOCATED_MEMORY, on AIX with IBM compiler in 32 bit mode,
so unusual in several ways.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-06-06%2015:37:23

#2 Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#1)
Re: BF assertion failure on mandrill in walsender, v13

On Thu, Jun 10, 2021 at 10:47:20AM +1200, Thomas Munro wrote:

Not sure if there is much chance of debugging this one-off failure
without a backtrace (long shot: any chance there's still a core
file?)

No; it was probably in a directory deleted for each run. One would need to
add dbx support to the buildfarm client, or perhaps add support for saving
build directories when there's a core, so I can operate dbx manually.

#3 Andrew Dunstan
andrew@dunslane.net
In reply to: Noah Misch (#2)
Re: BF assertion failure on mandrill in walsender, v13

On 6/10/21 1:47 AM, Noah Misch wrote:

On Thu, Jun 10, 2021 at 10:47:20AM +1200, Thomas Munro wrote:

Not sure if there is much chance of debugging this one-off failure
without a backtrace (long shot: any chance there's still a core
file?)

No; it was probably in a directory deleted for each run. One would need to
add dbx support to the buildfarm client, or perhaps add support for saving
build directories when there's a core, so I can operate dbx manually.

This is what the setting "keep_error_builds" does. In the END handler it
renames the build and install directories with a timestamp. Cleanup is
left to the user.

I don't have much knowledge of dbx, but I would take a patch for support.

cheers

andrew

--

Andrew Dunstan
EDB: https://www.enterprisedb.com

#4 Noah Misch
noah@leadboat.com
In reply to: Andrew Dunstan (#3)
Re: BF assertion failure on mandrill in walsender, v13

On Thu, Jun 10, 2021 at 09:08:06AM -0400, Andrew Dunstan wrote:

On 6/10/21 1:47 AM, Noah Misch wrote:

On Thu, Jun 10, 2021 at 10:47:20AM +1200, Thomas Munro wrote:

Not sure if there is much chance of debugging this one-off failure
without a backtrace (long shot: any chance there's still a core
file?)

No; it was probably in a directory deleted for each run. One would need to
add dbx support to the buildfarm client, or perhaps add support for saving
build directories when there's a core, so I can operate dbx manually.

This is what the setting "keep_error_builds" does. In the END handler it
renames the build and install directories with a timestamp. Cleanup is
left to the user.

Great. The machine has ample disk, so I'll add that setting.
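
For readers unfamiliar with the mechanism, the behavior described for
"keep_error_builds" (rename the build and install directories with a
timestamp, leave cleanup to the user) can be sketched roughly as below.
This is only an illustration in Python, not the buildfarm client's actual
code: the function name keep_error_build, the directory names, and the
timestamp suffix format are all assumptions made for the example.

```python
import os
import time


def keep_error_build(build_dir, install_dir, now=None):
    """Rename the build and install directories with a timestamp suffix.

    Hypothetical helper: the name, arguments, and suffix format are
    assumptions for illustration, not the buildfarm client's own.
    Returns the list of new directory paths that were kept.
    """
    # now=None means "use the current time".
    stamp = time.strftime("%Y%m%d%H%M%S", time.localtime(now))
    kept = []
    for path in (build_dir, install_dir):
        # Skip directories that were never created in this run.
        if os.path.isdir(path):
            target = f"{path}.{stamp}"
            os.rename(path, target)
            kept.append(target)
    # Nothing is deleted here; cleanup is left to the user.
    return kept
```

With the directories renamed rather than removed, a core file left in the
build tree would survive into the next run and could be examined by hand
with a debugger such as dbx.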