Fix handling of copy_file_range() return value

Started by Peter Eisentraut1 day ago4 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

While checking return/error handling of file system calls, I found that
the copy_file_range() call in pg_combinebackup has a potential problem.
If copy_file_range() returns 0, which is a documented condition, then
the loop never makes progress and could spin forever.

The other uses of copy_file_range() in the tree are surrounded by
different logic and don't appear to have this problem.

My suggested fix is to make a return value of 0 an error. It most
likely indicates that the source file has an unexpected size.

Attachments:

0001-Fix-handling-of-copy_file_range-return-value.patchtext/plain; charset=UTF-8; name=0001-Fix-handling-of-copy_file_range-return-value.patchDownload+3-1
#2Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Peter Eisentraut (#1)
Re: Fix handling of copy_file_range() return value

Hi,

On Mon, 22 Jun 2026 at 10:20, Peter Eisentraut <peter@eisentraut.org> wrote:

While checking return/error handling of file system calls, I found that
the copy_file_range() call in pg_combinebackup has a potential problem.
If copy_file_range() returns 0, which is a documented condition, then
the loop never makes progress and could spin forever.

The other uses of copy_file_range() in the tree are surrounded by
different logic and don't appear to have this problem.

My suggested fix is to make a return value of 0 an error. It most
likely indicates that the source file has an unexpected size.

You are right, that is a problem only with this use of
copy_file_range(), and your patch fixes it; LGTM.

--
Regards,
Nazir Bilal Yavuz
Microsoft

#3Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Peter Eisentraut (#1)
Re: Fix handling of copy_file_range() return value

At Mon, 22 Jun 2026 09:19:59 +0200, Peter Eisentraut <peter@eisentraut.org> wrote in

While checking return/error handling of file system calls, I found
that the copy_file_range() call in pg_combinebackup has a potential
problem. If copy_file_range() returns 0, which is a documented
condition, then the loop never makes progress and could spin forever.

The other uses of copy_file_range() in the tree are surrounded by
different logic and don't appear to have this problem.

My suggested fix is to make a return value of 0 an error. It most
likely indicates that the source file has an unexpected size.

Good catch. I agree with the analysis, and the proposed fix looks
reasonable to me.

I also checked the other four uses of copy_file_range() in the
tree. All but one handle a zero return value as EOF, and the remaining
case (check_copy_file_range()) appears safe because it is only testing
whether copy_file_range() is usable.

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#4Yingying Chen
cyy9255@gmail.com
In reply to: Peter Eisentraut (#1)
Re: Fix handling of copy_file_range() return value

On Mon, Jun 22, 2026 at 3:20 PM Peter Eisentraut <peter@eisentraut.org>
wrote:

While checking return/error handling of file system calls, I found that
the copy_file_range() call in pg_combinebackup has a potential problem.
If copy_file_range() returns 0, which is a documented condition, then
the loop never makes progress and could spin forever.

The other uses of copy_file_range() in the tree are surrounded by
different logic and don't appear to have this problem.

My suggested fix is to make a return value of 0 an error. It most
likely indicates that the source file has an unexpected size.

Hi, Peter,

Thanks for the patch. I agree wb==0 should be treated as an error, so the
fix direction looks good to me.

I have a small comment:
====
wb = copy_file_range(s->fd, &off, wfd, NULL, BLCKSZ - nwritten, 0);

if (wb < 0)
pg_fatal("error while copying file range from \"%s\" to
\"%s\": %m",
s->filename, output_filename);
else if (wb == 0)
pg_fatal("unexpected end of file while copying file range
from \"%s\" to \"%s\"",
s->filename, output_filename);

As copy_file_range copies from s->fd, should pg_fatal just s->filename in
the error message? In that case, input_filename is no longer used in
write_reconstructed_file(), then it might be removed from the argument list.

Regards,
Yingying Chen