[PATCH] Atomic pgrename on Windows

Started by Alexander Korotkovover 8 years ago25 messageshackers
Jump to latest
#1Alexander Korotkov
aekorotkov@gmail.com

Hi!

It's assumed in PostgreSQL codebase that pgrename atomically replaces
target file with source file even if target file is open and being read by
another process. And this assumption is true on Linux, but it's false on
Windows. MoveFileEx() triggers an error when target file is open (and
accordingly locked). Some our customers has been faced such errors while
operating heavily loaded PostgreSQL instance on Windows.

LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp" to
"pg_stat_tmp/global.stat": Permission denied

I've managed to reproduce this situation. Reliable reproducing of this
issue required patch to PostgreSQL core. I've written
slow-read-statfiles.patch for artificial slowdown
of pgstat_read_statsfiles() – sleep 100 ms after each statfile entry. If
you run make-100-dbs.sql on patched version, and then few times execute

select pg_stat_get_tuples_inserted('t1'::regclass);

in psql, then you would likely get the error above on Windows.

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows –
ReplaceFile() does that. ReplaceFiles() requires target file to exist,
this is why we still need to call MoveFileEx() when it doesn't exist.

This patch is based on work of Victor Spirin who was asked by Postgres Pro
to research this problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

slow-read-statfiles.patchapplication/octet-stream; name=slow-read-statfiles.patchDownload+1-0
make-100-dbs.sqlapplication/octet-stream; name=make-100-dbs.sqlDownload
atomic-pgrename-windows-1.patchapplication/octet-stream; name=atomic-pgrename-windows-1.patchDownload+31-31
#2Michael Paquier
michael@paquier.xyz
In reply to: Alexander Korotkov (#1)
Re: [PATCH] Atomic pgrename on Windows

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows – ReplaceFile()
does that. ReplaceFiles() requires target file to exist, this is why we
still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

This patch is based on work of Victor Spirin who was asked by Postgres Pro
to research this problem.

Victor has no community account so his name cannot be registered as a
co-author of the patch. I have added your name though.
--
Michael

#3Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#2)
Re: [PATCH] Atomic pgrename on Windows

On 2017-11-28 09:47:45 +0900, Michael Paquier wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows – ReplaceFile()
does that. ReplaceFiles() requires target file to exist, this is why we
still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

That seems like a *seriously* bad idea. What if we crash inbetween the
unlink and the rename?

I'm confused about the need for this. Shouldn't normally
opening all files FILE_SHARE_DELETE take care of this? See
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
"Note Delete access allows both delete and rename operations."

Is there an external process active that doesn't set that flag? Are we
missing setting it somewhere?

Greetings,

Andres Freund

#4Craig Ringer
craig@2ndquadrant.com
In reply to: Alexander Korotkov (#1)
Re: [PATCH] Atomic pgrename on Windows

On 27 November 2017 at 14:28, Alexander Korotkov <a.korotkov@postgrespro.ru>
wrote:

Hi!

It's assumed in PostgreSQL codebase that pgrename atomically replaces
target file with source file even if target file is open and being read by
another process. And this assumption is true on Linux, but it's false on
Windows. MoveFileEx() triggers an error when target file is open (and
accordingly locked). Some our customers has been faced such errors while
operating heavily loaded PostgreSQL instance on Windows.

LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp" to
"pg_stat_tmp/global.stat": Permission denied

That would explain a number of intermittent reports Iv'e seen floating
around.

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows –
ReplaceFile() does that. ReplaceFiles() requires target file to exist,
this is why we still need to call MoveFileEx() when it doesn't exist.

Look at the error codes for ReplaceFile:

https://msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx

The docs don't say it's atomic and the error codes suggest it may not be.
But there's a Microsoft Research paper claiming it's atomic -
http://research.microsoft.com/pubs/64525/tr-2006-45.pdf .

It appears that MoveFileTransacted would be what we really want, when we're
on NTFS (it won't work on FAT32 or network shares). See
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365241(v=vs.85).aspx
. But the docs have a preface warning that it's not recommended and may not
be available in future Windows versions, so that's not an option.

This Go language bug (https://github.com/golang/go/issues/8914) and this
cPython discussion (http://bugs.python.org/issue8828)) have discussion.

I found this comment particularly illuminating
https://bugs.python.org/msg146307 as it quotes what Java does. It uses
MoveFileEx.

See also:

*
https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows
*
https://blogs.msdn.microsoft.com/adioltean/2005/12/28/how-to-do-atomic-writes-in-a-file/
* https://msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx
*
https://msdn.microsoft.com/en-us/library/windows/desktop/hh802690(v=vs.85).aspx

(I sincerely hope that that blog post about atomic writes, which is 12
years old, is obsoleted by some new functionality. Because seriously, WTF.)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#3)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Nov 28, 2017 at 3:59 AM, Andres Freund <andres@anarazel.de> wrote:

On 2017-11-28 09:47:45 +0900, Michael Paquier wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows –

ReplaceFile()

does that. ReplaceFiles() requires target file to exist, this is why

we

still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

That seems like a *seriously* bad idea. What if we crash inbetween the
unlink and the rename?

I'm confused about the need for this. Shouldn't normally
opening all files FILE_SHARE_DELETE take care of this? See
https://msdn.microsoft.com/en-us/library/windows/desktop/
aa363858(v=vs.85).aspx
"Note Delete access allows both delete and rename operations."

Is there an external process active that doesn't set that flag?

I'm quite sure there was no such processed during my experimentation. No
antivirus or other disturbers. Moreover, error reproduces only with
artificial delay in pgstat_read_statsfiles(). So, it's clearly related to
lock placed by this function.

Are we missing setting it somewhere?

That's curious, but at least pgstat_read_statsfiles() seems to open file
with that flag.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#6Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#5)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Nov 28, 2017 at 5:52 PM, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

On Tue, Nov 28, 2017 at 3:59 AM, Andres Freund <andres@anarazel.de> wrote:

On 2017-11-28 09:47:45 +0900, Michael Paquier wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem.

It

appears to be possible to atomically replace file on Windows –

ReplaceFile()

does that. ReplaceFiles() requires target file to exist, this is why

we

still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

That seems like a *seriously* bad idea. What if we crash inbetween the
unlink and the rename?

I'm confused about the need for this. Shouldn't normally
opening all files FILE_SHARE_DELETE take care of this? See
https://msdn.microsoft.com/en-us/library/windows/desktop/aa3
63858(v=vs.85).aspx
"Note Delete access allows both delete and rename operations."

Is there an external process active that doesn't set that flag?

I'm quite sure there was no such processed during my experimentation. No
antivirus or other disturbers. Moreover, error reproduces only with
artificial delay in pgstat_read_statsfiles(). So, it's clearly related to
lock placed by this function.

Are we missing setting it somewhere?

That's curious, but at least pgstat_read_statsfiles() seems to open file
with that flag.

I wrote same console program to verify that windows API behaves so, and
it's not something odd inside PostgreSQL. Source code is attached.

So, with FILE_SHARE_DELETE flag you really can delete opened file
concurrently.

Session #1
rename_test.exe Open 1.txt
Opened
Press any key to continue . . .
Session #2
rename_test.exe Delete 1.txt
DeleteFile success
Closed

And you can rename it concurrently.

Session #1
rename_test.exe Open 1.txt
Opened
Press any key to continue . . .
Session #2
rename_test.exe MoveFileEx 1.txt 2.txt
MoveFileEx successClosed
Closed

But you can't replace it concurrently with another file. So as msdn
states, you can either delete or rename opened file concurrently. But you
can't replace it...

Session #1
rename_test.exe Open 1.txt
Opened
Press any key to continue . . .
Session #2
rename_test.exe MoveFileEx 2.txt 1.txt
MoveFileEx error: 5
Closed

But ReplaceFile works OK. Temporary file lives until session #1 close the
file.

Session #1
rename_test.exe Open 1.txt
Opened
Press any key to continue . . .
Session #2
rename_test.exe ReplaceFile 2.txt 1.txt
ReplaceFile success
Closed

I didn't try MoveFileTransacted, because deprecated function doesn't seem
like an option.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

rename_test.cpptext/x-c++src; charset=US-ASCII; name=rename_test.cppDownload
#7Alexander Korotkov
aekorotkov@gmail.com
In reply to: Craig Ringer (#4)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Nov 28, 2017 at 5:02 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 27 November 2017 at 14:28, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

It's assumed in PostgreSQL codebase that pgrename atomically replaces
target file with source file even if target file is open and being read by
another process. And this assumption is true on Linux, but it's false on
Windows. MoveFileEx() triggers an error when target file is open (and
accordingly locked). Some our customers has been faced such errors while
operating heavily loaded PostgreSQL instance on Windows.

LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp"
to "pg_stat_tmp/global.stat": Permission denied

That would explain a number of intermittent reports Iv'e seen floating
around.

Yeah.

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It

appears to be possible to atomically replace file on Windows –
ReplaceFile() does that. ReplaceFiles() requires target file to exist,
this is why we still need to call MoveFileEx() when it doesn't exist.

Look at the error codes for ReplaceFile:

https://msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx

The docs don't say it's atomic and the error codes suggest it may not be.
But there's a Microsoft Research paper claiming it's atomic -
http://research.microsoft.com/pubs/64525/tr-2006-45.pdf .

It appears that MoveFileTransacted would be what we really want, when
we're on NTFS (it won't work on FAT32 or network shares). See
https://msdn.microsoft.com/en-us/library/windows/
desktop/aa365241(v=vs.85).aspx . But the docs have a preface warning that
it's not recommended and may not be available in future Windows versions,
so that's not an option.

This Go language bug (https://github.com/golang/go/issues/8914) and this
cPython discussion (http://bugs.python.org/issue8828)) have discussion.

I found this comment particularly illuminating https://bugs.python.org/
msg146307 as it quotes what Java does. It uses MoveFileEx.

That's look bad. MoveFileTransacted() looks like best option, but it's not
an option since it's deprecated.
For now, ReplaceFile() function looks like best choice for renaming
statfiles. But it doesn't look good for critical datafiles whose are also
renamed using pgrename, because system can crash between rename of
destination file and rename of source file. Thus, MoveFileEx() seems still
best solution for critical datafiles where safety is more important than
concurrency. After reading provided links, I'm very suspicious about its
safety too. But it's seems like best available solution and it have
already passed some test of time...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#8Craig Ringer
craig@2ndquadrant.com
In reply to: Alexander Korotkov (#7)
Re: [PATCH] Atomic pgrename on Windows

On 29 November 2017 at 00:16, Alexander Korotkov <a.korotkov@postgrespro.ru>
wrote:

For now, ReplaceFile() function looks like best choice for renaming
statfiles. But it doesn't look good for critical datafiles whose are also
renamed using pgrename, because system can crash between rename of
destination file and rename of source file. Thus, MoveFileEx() seems still
best solution for critical datafiles where safety is more important than
concurrency. After reading provided links, I'm very suspicious about its
safety too. But it's seems like best available solution and it have
already passed some test of time...

Yeah.

Or alternately, we might need to extend WAL to include any temp file names,
etc, needed to allow us to recover from an interrupted operation during
redo. Frankly that may be the safest option; on Windows presume that there
is no atomic rename we can trust for critical files, and do it in multiple
steps.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#8)
Re: [PATCH] Atomic pgrename on Windows

On 2017-11-29 10:13:32 +0800, Craig Ringer wrote:

On 29 November 2017 at 00:16, Alexander Korotkov <a.korotkov@postgrespro.ru>
wrote:

For now, ReplaceFile() function looks like best choice for renaming
statfiles. But it doesn't look good for critical datafiles whose are also
renamed using pgrename, because system can crash between rename of
destination file and rename of source file. Thus, MoveFileEx() seems still
best solution for critical datafiles where safety is more important than
concurrency. After reading provided links, I'm very suspicious about its
safety too. But it's seems like best available solution and it have
already passed some test of time...

Or we can just add some retry logic like we have for opening files...

Or alternately, we might need to extend WAL to include any temp file names,
etc, needed to allow us to recover from an interrupted operation during
redo. Frankly that may be the safest option; on Windows presume that there
is no atomic rename we can trust for critical files, and do it in multiple
steps.

I'd rather remove windows support than go there.

Greetings,

Andres Freund

#10Stephen Frost
sfrost@snowman.net
In reply to: Alexander Korotkov (#1)
Re: [PATCH] Atomic pgrename on Windows

Greetings Alexander, all,

* Alexander Korotkov (a.korotkov@postgrespro.ru) wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows –
ReplaceFile() does that. ReplaceFiles() requires target file to exist,
this is why we still need to call MoveFileEx() when it doesn't exist.

There's been some discussion on this patch and it seems to, at least,
improve the situation on Windows even if it might not completely address
the issues. Does anyone else want to take a look and comment,
particularly those familiar with the Windows side of things? Otherwise,
I'm afraid this patch may end up just sitting in Needs review state fo
the entire CF and not getting moved forward, even though it seems like a
good improvement for our Windows users.

Thanks!

Stephen

#11Magnus Hagander
magnus@hagander.net
In reply to: Michael Paquier (#2)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Nov 28, 2017 at 2:47 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows –

ReplaceFile()

does that. ReplaceFiles() requires target file to exist, this is why we
still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with "target
doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems to work
most of the time today (if it mostly broke we'd be in trouble already), and
if it fails with a sharing violation, try ReplaceFile()? And perhaps end up
doing it something similar to what we do with shared memory which is just
to loop over it and try each a couple of time, before giving up and
failing?

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#12David Steele
david@pgmasters.net
In reply to: Magnus Hagander (#11)
Re: Re: [PATCH] Atomic pgrename on Windows

Hi Alexander,

On 1/20/18 10:13 AM, Magnus Hagander wrote:

On Tue, Nov 28, 2017 at 2:47 AM, Michael Paquier
<michael.paquier@gmail.com <mailto:michael.paquier@gmail.com>> wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem.  It
appears to be possible to atomically replace file on Windows – ReplaceFile()
does that.  ReplaceFiles() requires target file to exist, this is why we
still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with "target
doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems to
work most of the time today (if it mostly broke we'd be in trouble
already), and if it fails with a sharing violation, try ReplaceFile()?
And perhaps end up doing it something similar to what we do with shared
memory which is just to loop over it and try  each a couple of time,
before giving up and failing?

This patch was mistakenly left as Needs Review during the last
commitfest but it's pretty clear that a new patch is required.

This certainly sounds like a non-trivial change. Perhaps it should be
submitted for PG12?

Thanks,
--
-David
david@pgmasters.net

#13Alexander Korotkov
aekorotkov@gmail.com
In reply to: David Steele (#12)
Re: Re: [PATCH] Atomic pgrename on Windows

Hi, David!

On Tue, Mar 6, 2018 at 5:04 PM, David Steele <david@pgmasters.net> wrote:

On 1/20/18 10:13 AM, Magnus Hagander wrote:

On Tue, Nov 28, 2017 at 2:47 AM, Michael Paquier
<michael.paquier@gmail.com <mailto:michael.paquier@gmail.com>> wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru <mailto:a.korotkov@postgrespro.ru>>

wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this

problem. It

appears to be possible to atomically replace file on Windows –

ReplaceFile()

does that. ReplaceFiles() requires target file to exist, this is

why we

still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with "target
doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems to
work most of the time today (if it mostly broke we'd be in trouble
already), and if it fails with a sharing violation, try ReplaceFile()?
And perhaps end up doing it something similar to what we do with shared
memory which is just to loop over it and try each a couple of time,
before giving up and failing?

This patch was mistakenly left as Needs Review during the last
commitfest but it's pretty clear that a new patch is required.

OK! No objections against marking this patch RWF.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#14David Steele
david@pgmasters.net
In reply to: Alexander Korotkov (#13)
Re: [PATCH] Atomic pgrename on Windows

On 3/6/18 9:06 AM, Alexander Korotkov wrote:

On Tue, Mar 6, 2018 at 5:04 PM, David Steele <david@pgmasters.net
<mailto:david@pgmasters.net>> wrote:

On 1/20/18 10:13 AM, Magnus Hagander wrote:

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with "target
doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems to
work most of the time today (if it mostly broke we'd be in trouble
already), and if it fails with a sharing violation, try ReplaceFile()?
And perhaps end up doing it something similar to what we do with shared
memory which is just to loop over it and try  each a couple of time,
before giving up and failing?

This patch was mistakenly left as Needs Review during the last
commitfest but it's pretty clear that a new patch is required.

OK!  No objections against marking this patch RWF.

Hmmm, I just noticed this categorized as a bug. I thought it was a
refactor.

Even so, it looks like the approach needs a rethink so better to wait
for that.

Marked Returned with Feedback.

Thanks,
--
-David
david@pgmasters.net

#15Alexander Korotkov
aekorotkov@gmail.com
In reply to: David Steele (#14)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Mar 6, 2018 at 5:11 PM, David Steele <david@pgmasters.net> wrote:

On 3/6/18 9:06 AM, Alexander Korotkov wrote:

On Tue, Mar 6, 2018 at 5:04 PM, David Steele <david@pgmasters.net
<mailto:david@pgmasters.net>> wrote:

On 1/20/18 10:13 AM, Magnus Hagander wrote:

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with

"target

doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems

to

work most of the time today (if it mostly broke we'd be in trouble
already), and if it fails with a sharing violation, try

ReplaceFile()?

And perhaps end up doing it something similar to what we do with

shared

memory which is just to loop over it and try each a couple of

time,

before giving up and failing?

This patch was mistakenly left as Needs Review during the last
commitfest but it's pretty clear that a new patch is required.

OK! No objections against marking this patch RWF.

Hmmm, I just noticed this categorized as a bug. I thought it was a
refactor.

Yes, that's naturally a bug. Not very critical though.

Even so, it looks like the approach needs a rethink so better to wait

for that.

In this thread we've found at least two possible approaches to fix this
bug. But both of them need to be implemented and tested.

Marked Returned with Feedback.

OK!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#16Alexander Korotkov
aekorotkov@gmail.com
In reply to: Magnus Hagander (#11)
Re: [PATCH] Atomic pgrename on Windows

Hi!

I'd like to resume the discussion on this subject. Sorry for so long delay.

On Sat, Jan 20, 2018 at 6:13 PM Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Nov 28, 2017 at 2:47 AM, Michael Paquier <michael.paquier@gmail.com> wrote:

On Mon, Nov 27, 2017 at 3:28 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
appears to be possible to atomically replace file on Windows – ReplaceFile()
does that. ReplaceFiles() requires target file to exist, this is why we
still need to call MoveFileEx() when it doesn't exist.

Do you think that it could be safer to unlink the target file first
with pgunlink()? This way you make sure that the target file is
removed and not locked. This change makes me worrying about the
introduction of more race conditions.

Unlinking it first seems dangerous, as pointed out by Andres.

What about first trying ReplaceFile() and then if it fails with "target doesn't exist", then call MoveFileEx().

Or the other way around -- try MoveFileEx() first since that seems to work most of the time today (if it mostly broke we'd be in trouble already), and if it fails with a sharing violation, try ReplaceFile()? And perhaps end up doing it something similar to what we do with shared memory which is just to loop over it and try each a couple of time, before giving up and failing?

Unfortunately, it seems that none of such strategies would fit all the cases.

Basically, we have two option for renaming a file.
* MoveFileEx() – safest possible option, less likely loose files, but
takes a lock on target.
* ReplaceFile() – "lockless" option, but may loose files on OS crash.

Also we have two different cases of files renaming:
1) Renaming durable files. When target already exists, on OS crash we
expect finding either old or new file in target filename. We don't
expect to find nothing in target filename.
2) Renaming temporary files. In this case we don't much care on
loosing files on OS crash. But locking appears to be an issue in some
cases.

So, in 1st case it doesn't seem like an option to try ReplaceFile()
either in first or second place. In both ways it causes a risk to
loose a target file, which seems unacceptable.

In the 2nd case, trying MoveFileEx() then ReplaceFile() or vise versa
might work. But error codes of these functions doesn't tell
explicitly whether target file exists. So, I prefer checking it
explicitly with GetFileAttributes().

Attached patch implements idea described in [1]. It implements
separate pgrename_temp() function, which is better for concurrent
operation but less safe.

Links
1. /messages/by-id/CAPpHfds9trA6ipezK3BsuuOSQwEmESiqj8pkOxACFJpoLpcoNw@mail.gmail.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

pgrename_temp-1.patchapplication/octet-stream; name=pgrename_temp-1.patchDownload+99-4
#17Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Alexander Korotkov (#16)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Aug 06, 2019 at 09:03:08PM +0300, Alexander Korotkov wrote:

...

Unfortunately, it seems that none of such strategies would fit all the
cases.

Basically, we have two option for renaming a file. * MoveFileEx() –
safest possible option, less likely loose files, but takes a lock on
target. * ReplaceFile() – "lockless" option, but may loose files on
OS crash.

Also we have two different cases of files renaming: 1) Renaming
durable files. When target already exists, on OS crash we expect
finding either old or new file in target filename. We don't expect to
find nothing in target filename. 2) Renaming temporary files. In
this case we don't much care on loosing files on OS crash. But
locking appears to be an issue in some cases.

So, in 1st case it doesn't seem like an option to try ReplaceFile()
either in first or second place. In both ways it causes a risk to
loose a target file, which seems unacceptable.

In the 2nd case, trying MoveFileEx() then ReplaceFile() or vise versa
might work. But error codes of these functions doesn't tell explicitly
whether target file exists. So, I prefer checking it explicitly with
GetFileAttributes().

Attached patch implements idea described in [1]. It implements
separate pgrename_temp() function, which is better for concurrent
operation but less safe.

I don't have access to a Windows machine and my developer experience
with that platform is pretty much nil, but I think this patch makes
sense. It's not an ideal solution, but it's not clear such solution
exists, and an improvement is better than nothing.

I have two minor comments about rename_temp:

1) The name might seem to be hinting it's about files opened using
OpenTemporaryFile, but in practice it's about files that are not
critical. But maybe it's true.

2) I think the rename_temp comment should mention it can only be used in
cases when the renames happen in a single process (non-concurrently). If
we could call rename_temp() concurrently from two different processes,
it won't work as expected. Of course, we only call rename_temp from stat
collector and syslogger, where it obviously works.

Anyway, I'm really nitpicking here ...

Are there any objections to get the current patch committed as is, so
that it does not get pushed yet again to the next commitfest.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tomas Vondra (#17)
Re: [PATCH] Atomic pgrename on Windows

Hi!

On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I don't have access to a Windows machine and my developer experience
with that platform is pretty much nil, but I think this patch makes
sense. It's not an ideal solution, but it's not clear such solution
exists, and an improvement is better than nothing.

Thank you for your attention to this patch!

I have two minor comments about rename_temp:

1) The name might seem to be hinting it's about files opened using
OpenTemporaryFile, but in practice it's about files that are not
critical. But maybe it's true.

We may invent another name. What about rename_fragile()?

2) I think the rename_temp comment should mention it can only be used in
cases when the renames happen in a single process (non-concurrently). If
we could call rename_temp() concurrently from two different processes,
it won't work as expected. Of course, we only call rename_temp from stat
collector and syslogger, where it obviously works.

Good point, this should be reflected in comments.

Anyway, I'm really nitpicking here ...

Are there any objections to get the current patch committed as is, so
that it does not get pushed yet again to the next commitfest.

It would be good to commit. Let's get agreement on function name first.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#18)
Re: [PATCH] Atomic pgrename on Windows

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I don't have access to a Windows machine and my developer experience
with that platform is pretty much nil, but I think this patch makes
sense. It's not an ideal solution, but it's not clear such solution
exists, and an improvement is better than nothing.

Thank you for your attention to this patch!

FWIW, I don't like this patch much at all. I too know nothing about
Windows, but I do *not* like inventing a distinction between "rename"
and "rename_temp" and expecting all call sites to have to decide
which one to use. That's allowing a single platform's implementation
bugs to dictate our APIs globally; plus it's not clear that every
call site can know which is more appropriate.

regards, tom lane

#20Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#19)
Re: [PATCH] Atomic pgrename on Windows

On Tue, Jan 7, 2020 at 9:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I don't have access to a Windows machine and my developer experience
with that platform is pretty much nil, but I think this patch makes
sense. It's not an ideal solution, but it's not clear such solution
exists, and an improvement is better than nothing.

Thank you for your attention to this patch!

FWIW, I don't like this patch much at all. I too know nothing about
Windows, but I do *not* like inventing a distinction between "rename"
and "rename_temp" and expecting all call sites to have to decide
which one to use. That's allowing a single platform's implementation
bugs to dictate our APIs globally; plus it's not clear that every
call site can know which is more appropriate.

I'm not sure issue we faced is really about single platform. TBH, the
assumptions we place to rename function is very strict. We assume
rename works atomically on system crash. And we indirectly assume it
can work concurrently as least with file readers. The both are
probably true for Linux with most popular filesystems. But we do
support pretty many platforms. I think the issue we didn't
investigate rename properties well on all of them. But if we do, it
might happen some assumptions are wrong on multiple platforms.
Windows is just used busy enough to spot the problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#20)
#22Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#21)
#23David Steele
david@pgmasters.net
In reply to: Alexander Korotkov (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Steele (#23)
#25Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#24)