unexplainable psql exit status 1

Started by Tim Kaneover 12 years ago5 messagesgeneral
Jump to latest
#1Tim Kane
tim.kane@gmail.com

HI all,

Is anyone aware of cases where psql will occasionally return an exit status
of 1, despite there being no obvious error condition?

I have a regular (daily) cron that executes a fairly simple script of the
form:

psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v
YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1

The execution of this duplicate_removal.sql script appears to run without
error. It’s fairly simple, and never fails to execute in and of itself.
However, on very rare occasion, the psql session will run to completion,
without visible/logged error – but it returns an exit status 1.

The only thing close to an error coming from this script is a notice,
resulting from a CREATE TABLE IF NOT EXISTS. This is no different from any
other execution of the script.

NOTICE: relation "discarded" already exists, skipping

I can see that the output is correctly redirected to $DB_LOG and I can see
the successful COMMIT response returned at the end of the session. I can
also verify that the behaviour of the script is as expected, to completion.

The psql man page suggests that an exit status of 1 is a fatal error of its
own.

EXIT STATUS
psql returns 0 to the shell if it finished normally, 1 if a fatal error
of its own occurs (e.g. out of memory, file not found), 2 if the connection to
the server went bad and the session was not
interactive, and 3 if an error occurred in a script and the variable
ON_ERROR_STOP was set.

There is no indication in the postgres log file of anything adverse.

At this point my only option is to set log_statement = all, enable shell
debugging, and wait a few months for it to happen again (I’ve seen this
occur only twice, over a 2-3 month period). Has anyone encountered this
behaviour before?

Thanks

Tim

#2Tim Kane
tim.kane@gmail.com
In reply to: Tim Kane (#1)
Re: unexplainable psql exit status 1

I should have mentioned, psql client and server version 9.2.4

From: Tim Kane <tim.kane@gmail.com>
Date: Wednesday, 4 December 2013 10:59
To: pgsql-general General <pgsql-general@postgresql.org>
Subject: unexplainable psql exit status 1

HI all,

Is anyone aware of cases where psql will occasionally return an exit status
of 1, despite there being no obvious error condition?

I have a regular (daily) cron that executes a fairly simple script of the
form:

psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v
YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1

The execution of this duplicate_removal.sql script appears to run without
error. It’s fairly simple, and never fails to execute in and of itself.
However, on very rare occasion, the psql session will run to completion,
without visible/logged error – but it returns an exit status 1.

The only thing close to an error coming from this script is a notice,
resulting from a CREATE TABLE IF NOT EXISTS. This is no different from any
other execution of the script.

NOTICE: relation "discarded" already exists, skipping

I can see that the output is correctly redirected to $DB_LOG and I can see
the successful COMMIT response returned at the end of the session. I can
also verify that the behaviour of the script is as expected, to completion.

The psql man page suggests that an exit status of 1 is a fatal error of its
own.

EXIT STATUS
psql returns 0 to the shell if it finished normally, 1 if a fatal error
of its own occurs (e.g. out of memory, file not found), 2 if the connection to
the server went bad and the session was not
interactive, and 3 if an error occurred in a script and the variable
ON_ERROR_STOP was set.

There is no indication in the postgres log file of anything adverse.

At this point my only option is to set log_statement = all, enable shell
debugging, and wait a few months for it to happen again (I’ve seen this
occur only twice, over a 2-3 month period). Has anyone encountered this
behaviour before?

Thanks

Tim

#3Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tim Kane (#1)
Re: unexplainable psql exit status 1

Tim Kane wrote:

Is anyone aware of cases where psql will occasionally return an exit status of 1, despite there being
no obvious error condition?

I have a regular (daily) cron that executes a fairly simple script of the form:

psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v
YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1

The execution of this duplicate_removal.sql script appears to run without error. It’s fairly simple,
and never fails to execute in and of itself.
However, on very rare occasion, the psql session will run to completion, without visible/logged error
– but it returns an exit status 1.

I can see that the output is correctly redirected to $DB_LOG and I can see the successful COMMIT
response returned at the end of the session. I can also verify that the behaviour of the script is as
expected, to completion.

The psql man page suggests that an exit status of 1 is a fatal error of its own.

EXIT STATUS
psql returns 0 to the shell if it finished normally, 1 if a fatal error of its own occurs
(e.g. out of memory, file not found), 2 if the connection to the server went bad and
the session was not interactive, and 3 if an error occurred in a script and the variable
ON_ERROR_STOP was set.

There is no indication in the postgres log file of anything adverse.

At this point my only option is to set log_statement = all, enable shell debugging, and wait a few
months for it to happen again (I’ve seen this occur only twice, over a 2-3 month period). Has anyone
encountered this behaviour before?

According to the documentation you quoted, it would be useless to set
log_statement=all, because the problem is on the client end.

Shell debugging sounds slightly more promising, but it will probably
not tell you much more than that psql returned with exit code 1.

A quick look at the code gave me the impression that psql will always
write an error message before exiting with 1, but I may have missed a case.

Are you sure that there is nothing on stdout or stderr in the cases where
psql returns 1?

What is the last psql command that is executed?

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Tim Kane
tim.kane@gmail.com
In reply to: Laurenz Albe (#3)
Re: unexplainable psql exit status 1

According to the documentation you quoted, it would be useless to set
log_statement=all, because the problem is on the client end.

Shell debugging sounds slightly more promising, but it will probably
not tell you much more than that psql returned with exit code 1.

A quick look at the code gave me the impression that psql will always
write an error message before exiting with 1, but I may have missed a case.

Are you sure that there is nothing on stdout or stderr in the cases where
psql returns 1?

What is the last psql command that is executed?

Yours,
Laurenz Albe

Thanks Laurenz.
I’ve enabled shell debugging to be 100% sure that I’m not munging the return
code anywhere. It’s entirely possible there is something going on at the
shell side of things, though I fail to see how just at the minute :)
The output of the script is as follows, bearing in mind that all STDERR is
being redirected to STDOUT – there should be no other output to be found.
I’ve verified stdout just in case, to no avail.

\set QUIET 1
\timing
\set HISTSIZE 2000
\unset QUIET
DISCARD ALL;
DISCARD ALL
Time: 0.321 ms
—-SELECT STATEMENT SNIPPED
SELECT 7713
Time: 22.081 ms
CREATE TABLE IF NOT EXISTS duplicates(like source_table);
psql:/path/to/duplicate_removal.sql:36: NOTICE: relation "duplicates"
already exists, skipping
CREATE TABLE
Time: 0.416 ms
BEGIN;
BEGIN
Time: 0.210 ms
—-INSERT STATEMENT SNIPPED
INSERT 0 7713
Time: 93.855 ms

-—DELETE STATEMENT SNIPPED
DELETE 7713
Time: 33.939 ms

COMMIT;
COMMIT
Time: 0.242 ms

Cheers,

Tim

#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tim Kane (#4)
Re: unexplainable psql exit status 1

Tim Kane wrote:

I’ve enabled shell debugging to be 100% sure that I’m not munging the return code anywhere. It’s
entirely possible there is something going on at the shell side of things, though I fail to see how
just at the minute :)
The output of the script is as follows, bearing in mind that all STDERR is being redirected to STDOUT
– there should be no other output to be found.
I’ve verified stdout just in case, to no avail.

[...]

COMMIT;
COMMIT
Time: 0.242 ms

Doesn't look like that could cause return code 1 ...

Just a random idea: are there any weird characters at
the end of your SQL script?
But that wouldn't explain why psql sometimes emits
return code 1 and sometimes not ...

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general