pg_dumpall bug exit code 0 with fatal

Started by Владимир Фролов, 8 days ago, 2 messages, bugs
#1 Владимир Фролов
frolov.vova@gmail.com

I have a two-node cluster running PostgreSQL 16.3 (Debian
16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian
12.2.0-14) 12.2.0, 64-bit, with WAL logical replication. When I run
pg_dumpall, I get this error:

pg_dump: error: Dumping the contents of table "*****" failed: PQgetResult() failed.
pg_dump: detail: Error message from server: ERROR: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
pg_dump: detail: Command was: COPY **** ( **** ) TO stdout;
pg_dumpall: error: pg_dump failed on database "**********", exiting

The error itself is expected, BUT the exit code is 0! My backup script
reports this task as OK and monitoring reports OK too, but the backup is
empty :(

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Владимир Фролов (#1)
Re: pg_dumpall bug exit code 0 with fatal

Владимир Фролов <frolov.vova@gmail.com> writes:

> I have a two-node cluster running PostgreSQL 16.3 (Debian
> 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian
> 12.2.0-14) 12.2.0, 64-bit, with WAL logical replication. When I run
> pg_dumpall, I get this error:
>
> pg_dump: error: Dumping the contents of table "*****" failed: PQgetResult() failed.
> pg_dump: detail: Error message from server: ERROR: canceling statement due to conflict with recovery
> DETAIL: User query might have needed to see row versions that must be removed.
> pg_dump: detail: Command was: COPY **** ( **** ) TO stdout;
> pg_dumpall: error: pg_dump failed on database "**********", exiting
>
> The error itself is expected, BUT the exit code is 0!

That's pretty hard to believe. pg_dumpall emits that message only
here:

    ret = runPgDump(dbname, create_opts);
    if (ret != 0)
        pg_fatal("pg_dump failed on database \"%s\", exiting", dbname);

and pg_fatal is defined here (logging.h):

    /*
     * A common shortcut: pg_log_error() and immediately exit(1).
     */
    #define pg_fatal(...) do { \
            pg_log_generic(PG_LOG_ERROR, PG_LOG_PRIMARY, __VA_ARGS__); \
            exit(1); \
        } while(0)

So the exit code definitely should have been 1. Maybe you have some
kind of wrapper around pg_dumpall that is failing to pass the exit
code through correctly?

regards, tom lane