hash join error improvement (old)

Started by Alvaro Herreraalmost 6 years ago8 messageshackers
Jump to latest
#1Alvaro Herrera
alvherre@2ndquadrant.com

I recently noticed this in a customer log file:
ERROR: could not read from hash-join temporary file: Success

The problem is we're reporting with %m when the problem is a partial
read or write.

I propose the attached patch to solve it: report "wrote only X of X
bytes". This caused a lot of other trouble, the cause of which I've
been unable to pinpoint as yet. But in the meantime, this is already a
small improvement.

--
�lvaro Herrera

Attachments:

hashjoin.patchtext/x-diff; charset=us-asciiDownload+24-4
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: hash join error improvement (old)

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I recently noticed this in a customer log file:
ERROR: could not read from hash-join temporary file: Success
The problem is we're reporting with %m when the problem is a partial
read or write.
I propose the attached patch to solve it: report "wrote only X of X
bytes". This caused a lot of other trouble, the cause of which I've
been unable to pinpoint as yet. But in the meantime, this is already a
small improvement.

+1 for the idea, but there's still one small problem with what you did
here: errcode_for_file_access() looks at errno, which we can presume
will not have a relevant value in the "wrote only %d bytes" paths.

Most places that are taking pains to deal with this scenario
do something like

errno = 0;
if (write(fd, data, len, xlrec->offset) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", path)));
}

I don't mind if you want to extend that paradigm to also use "wrote only
%d bytes" wording, but the important point is to get the SQLSTATE set on
the basis of ENOSPC rather than whatever random value errno will have
otherwise.

regards, tom lane

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: hash join error improvement (old)

Hi Tom, thanks for looking.

On 2020-May-25, Tom Lane wrote:

I don't mind if you want to extend that paradigm to also use "wrote only
%d bytes" wording, but the important point is to get the SQLSTATE set on
the basis of ENOSPC rather than whatever random value errno will have
otherwise.

Hmm, right -- I was extending the partial read case to apply to a
partial write, and we deal with those very differently. I changed the
write case to use our standard approach.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

hashjoin.patchtext/x-diff; charset=us-asciiDownload+24-4
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: hash join error improvement (old)

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Hmm, right -- I was extending the partial read case to apply to a
partial write, and we deal with those very differently. I changed the
write case to use our standard approach.

Actually ... looking more closely, this proposed change in
ExecHashJoinSaveTuple flat out doesn't work, because it assumes that
BufFileWrite reports errors the same way as write(), which is not the
case. In particular, written < 0 can't happen; moreover, you've
removed detection of a short write as opposed to a completely failed
write.

Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
which calls FileWrite which takes pains to set errno correctly after a
short write --- so other than the lack of commentary about these
functions' error-reporting API, I don't think there's any actual bug here.
Are you sure you correctly identified the source of the bogus error
report?

Similarly, I'm afraid you introduced rather than removed problems
in ExecHashJoinGetSavedTuple. BufFileRead doesn't use negative
return values either.

regards, tom lane

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: hash join error improvement (old)

On 2020-May-26, Tom Lane wrote:

Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
which calls FileWrite which takes pains to set errno correctly after a
short write --- so other than the lack of commentary about these
functions' error-reporting API, I don't think there's any actual bug here.

Doh, you're right, this patch is completely broken ... aside from
carelessly writing the wrong "if" test, my unfamiliarity with the stdio
fread/fwrite interface is showing. I'll look more carefully.

Are you sure you correctly identified the source of the bogus error
report?

Nope. And I wish the bogus error report was all there was to it. The
actual problem is a server crash.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: hash join error improvement (old)

On 2020-May-26, Tom Lane wrote:

Are you sure you correctly identified the source of the bogus error
report?

This version's better. It doesn't touch the write side at all.
On the read side, only report a short read as such if errno's not set.

This error isn't frequently seen. This page
https://blog.csdn.net/pg_hgdb/article/details/106279303
(A Postgres fork; blames the error on the temp hash files being encrypted,
suggests to increase temp_buffers) is the only one I found.

There are more uses of BufFileRead that don't bother to distinguish
these two cases apart, though -- logtape.c, tuplestore.c,
gistbuildbuffers.c all do the same.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

hashjoin-3.patchtext/x-diff; charset=us-asciiDownload+22-6
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: hash join error improvement (old)

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

There are more uses of BufFileRead that don't bother to distinguish
these two cases apart, though -- logtape.c, tuplestore.c,
gistbuildbuffers.c all do the same.

Yeah. I rather suspect that callers of BufFileRead/Write are mostly
expecting that those functions will throw an ereport() for any interesting
error condition. Maybe we should make it so, instead of piecemeal fixing
the callers?

regards, tom lane

#8Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#7)
Re: hash join error improvement (old)

On Wed, May 27, 2020 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

There are more uses of BufFileRead that don't bother to distinguish
these two cases apart, though -- logtape.c, tuplestore.c,
gistbuildbuffers.c all do the same.

Yeah. I rather suspect that callers of BufFileRead/Write are mostly
expecting that those functions will throw an ereport() for any interesting
error condition. Maybe we should make it so, instead of piecemeal fixing
the callers?

Yeah. I proposed that over here:

/messages/by-id/CA+hUKGK0w+GTs8aDvsKDVu7cFzSE5q+0NP_9kPSxg2NA1NeZew@mail.gmail.com

But I got stuck trying to figure out whether to back-patch (arguably
yes: there are bugs here, but arguably no: the interfaces change).