PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Started by Craig Ringer, about 8 years ago. 182 messages, hackers list.
#1 Craig Ringer
craig@2ndquadrant.com

Hi all

Some time ago I ran into an issue where a user encountered data corruption
after a storage error. PostgreSQL played a part in that corruption by
allowing a checkpoint to complete after what should've been a fatal error.

TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at
least on Linux. When fsync() returns success it means "all writes since the
last fsync have hit disk" but we assume it means "all writes since the last
SUCCESSFUL fsync have hit disk".

Pg wrote some blocks, which went to OS dirty buffers for writeback.
Writeback failed due to an underlying storage error. The block I/O layer
and XFS marked the writeback page as failed (AS_EIO), but had no way to
tell the app about the failure. When Pg called fsync() on the FD during the
next checkpoint, fsync() returned EIO because of the flagged page, to tell
Pg that a previous async write failed. Pg treated the checkpoint as failed
and didn't advance the redo start position in the control file.

All good so far.

But then we retried the checkpoint, which retried the fsync(). The retry
succeeded, because the prior fsync() *cleared the AS_EIO bad page flag*.

The write never made it to disk, but we completed the checkpoint, and
merrily carried on our way. Whoops, data loss.

The clear-error-and-continue behaviour of fsync is not documented as far as
I can tell. Nor is fsync() returning EIO, unless you have a very new Linux
man-pages with the patch I wrote to add it. But from what I can see in the
POSIX standard we are not given any guarantees about what happens on
fsync() failure at all, so we're probably wrong to assume that retrying
fsync() is safe.

If the server had been using ext3 or ext4 with errors=remount-ro, the
problem wouldn't have occurred because the first I/O error would've
remounted the FS and stopped Pg from continuing. But XFS doesn't have that
option. There may be other situations where this can occur too, involving
LVM and/or multipath, but I haven't comprehensively dug out the details yet.

It proved possible to recover the system by faking up a backup label from
before the first incorrectly-successful checkpoint, forcing redo to repeat
and write the lost blocks. But ... what a mess.

I posted about the underlying fsync issue here some time ago:

https://stackoverflow.com/q/42434872/398670

but haven't had a chance to follow up about the Pg specifics.

I've been looking at the problem on and off and haven't come up with a good
answer. I think we should just PANIC and let redo sort it out by repeating
the failed write when it repeats work since the last checkpoint.

The API for async buffered writes and fsync offers us no way to find
out which page failed, so we can't just selectively redo that write. I
think we do know the relfilenode associated with the fd that failed to
fsync, but not much more. So the alternative seems to be some sort of
potentially complex online-redo scheme where we replay WAL for only the
relation on which we had the fsync() error, while otherwise servicing
queries normally. That's likely to be extremely error-prone and hard to
test, and it's trying to solve a case where on other filesystems the whole
DB would grind to a halt anyway.

I looked into whether we can solve it with use of the AIO API instead, but
the mess is even worse there - from what I can tell you can't even reliably
guarantee fsync at all on all Linux kernel versions.

We already PANIC on fsync() failure for WAL segments. We just need to do
the same for data forks at least for EIO. This isn't as bad as it seems
because AFAICS fsync only returns EIO in cases where we should be stopping
the world anyway, and many FSes will do that for us.

There are rather a lot of pg_fsync() callers. While we could handle this
case-by-case for each one, I'm tempted to just make pg_fsync() itself
intercept EIO and PANIC. Thoughts?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#1)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Retrying fsync() is not OK at
least on Linux. When fsync() returns success it means "all writes since the
last fsync have hit disk" but we assume it means "all writes since the last
SUCCESSFUL fsync have hit disk".

If that's actually the case, we need to push back on this kernel brain
damage, because as you're describing it fsync would be completely useless.

Moreover, POSIX is entirely clear that successful fsync means all
preceding writes for the file have been completed, full stop, doesn't
matter when they were issued.

regards, tom lane

#3 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#2)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check
the returned status, sometimes doing retries like in mdsync, so what is
proposed here would be a regression.
--
Michael

#4 Thomas Munro
thomas.munro@gmail.com
In reply to: Michael Paquier (#3)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check
the returned status, sometimes doing retries like in mdsync, so what is
proposed here would be a regression.

Craig, is the phenomenon you described the same as the second issue
"Reporting writeback errors" discussed in this article?

https://lwn.net/Articles/724307/

"Current kernels might report a writeback error on an fsync() call,
but there are a number of ways in which that can fail to happen."

That's... I'm speechless.

--
Thomas Munro
http://www.enterprisedb.com

#5 Justin Pryzby
pryzby@telsasoft.com
In reply to: Thomas Munro (#4)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Thu, Mar 29, 2018 at 11:30:59AM +0900, Michael Paquier wrote:

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check
the returned status, sometimes doing retries like in mdsync, so what is
proposed here would be a regression.

The retries are the source of the problem; the first fsync() can return EIO,
and also *clears the error*, causing a 2nd fsync (of the same data) to return
success.

(Note, I can see that it might be useful to PANIC on EIO but retry for ENOSPC).

On Thu, Mar 29, 2018 at 03:48:27PM +1300, Thomas Munro wrote:

Craig, is the phenomenon you described the same as the second issue
"Reporting writeback errors" discussed in this article?
https://lwn.net/Articles/724307/

Worse, the article acknowledges the behavior without apparently suggesting
that it be changed:

"Storing that value in the file structure has an important benefit: it makes
it possible to report a writeback error EXACTLY ONCE TO EVERY PROCESS THAT
CALLS FSYNC() .... In current kernels, ONLY THE FIRST CALLER AFTER AN ERROR
OCCURS HAS A CHANCE OF SEEING THAT ERROR INFORMATION."

I believe I reproduced the problem behavior using the dmsetup "error"
target; see attached.

strace looks like this:

kernel is Linux 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

1 open("/dev/mapper/eio", O_RDWR|O_CREAT, 0600) = 3
2 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
3 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
4 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
5 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
6 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
7 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
8 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 2560
9 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = -1 ENOSPC (No space left on device)
10 dup(2) = 4
11 fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
12 brk(NULL) = 0x1299000
13 brk(0x12ba000) = 0x12ba000
14 fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
15 write(4, "write(1): No space left on devic"..., 34write(1): No space left on device
16 ) = 34
17 close(4) = 0
18 fsync(3) = -1 EIO (Input/output error)
19 dup(2) = 4
20 fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
21 fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
22 write(4, "fsync(1): Input/output error\n", 29fsync(1): Input/output error
23 ) = 29
24 close(4) = 0
25 close(3) = 0
26 open("/dev/mapper/eio", O_RDWR|O_CREAT, 0600) = 3
27 fsync(3) = 0
28 write(3, "\0", 1) = 1
29 fsync(3) = 0
30 exit_group(0) = ?

2: EIO isn't seen initially due to writeback page cache;
9: ENOSPC due to small device
18: original IO error reported by fsync, good
25: the original FD is closed
26: ..and file reopened
27: fsync on file with still-dirty data+EIO returns success BAD

10, 19: I'm not sure why there's a dup(2); I guess glibc thinks that perror
should write to a separate FD (?)

Also note, close() ALSO returned success... which you might think exonerates
the 2nd fsync(), but I think may itself be problematic, no? In any case, the
2nd byte certainly never got written to the dm error device, and the failure
status was lost following fsync().

I get the exact same behavior if I break after one write() loop, such as to
avoid ENOSPC.

Justin

Attachments:

eio.c (text/x-csrc; charset=us-ascii)
#6 Thomas Munro
thomas.munro@gmail.com
In reply to: Justin Pryzby (#5)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:

The retries are the source of the problem; the first fsync() can return EIO,
and also *clears the error*, causing a 2nd fsync (of the same data) to return
success.

What I'm failing to grok here is how that error flag even matters,
whether it's a single bit or a counter as described in that patch. If
write back failed, *the page is still dirty*. So all future calls to
fsync() need to try to flush it again, and (presumably) fail again
(unless it happens to succeed this time around).

--
Thomas Munro
http://www.enterprisedb.com

#7 Craig Ringer
craig@2ndquadrant.com
In reply to: Thomas Munro (#6)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 29 March 2018 at 13:06, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:

On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby <pryzby@telsasoft.com>
wrote:

The retries are the source of the problem; the first fsync() can return
EIO, and also *clears the error*, causing a 2nd fsync (of the same data)
to return success.

What I'm failing to grok here is how that error flag even matters,
whether it's a single bit or a counter as described in that patch. If
write back failed, *the page is still dirty*. So all future calls to
fsync() need to try to flush it again, and (presumably) fail again
(unless it happens to succeed this time around).

You'd think so. But it doesn't appear to work that way. You can see for
yourself with the "error" device-mapper target mapped over part of a
volume.

I wrote a test case here.

https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c

I don't pretend the kernel behaviour is sane, and it's possible I've made
an error in my analysis. But since I've observed this in the wild, and seen
it in a test case, I strongly suspect that what I've described is just
what's happening, brain-dead or no.

Presumably the kernel marks the page clean when it dispatches it to the I/O
subsystem and doesn't dirty it again on I/O error? I haven't dug that deep
on the kernel side. See the stackoverflow post for details on what I found
in kernel code analysis.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#8 Craig Ringer
craig@2ndquadrant.com
In reply to: Thomas Munro (#4)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 29 March 2018 at 10:48, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:

On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier <michael@paquier.xyz>
wrote:

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check
the returned status, sometimes doing retries like in mdsync, so what is
proposed here would be a regression.

Craig, is the phenomenon you described the same as the second issue
"Reporting writeback errors" discussed in this article?

https://lwn.net/Articles/724307/

A variant of it, by the looks.

The problem in our case is that the kernel only tells us about the error
once. It then forgets about it. So yes, that seems like a variant of the
statement:

"Current kernels might report a writeback error on an fsync() call,
but there are a number of ways in which that can fail to happen."

That's... I'm speechless.

Yeah.

It's a bit nuts.

I was astonished when I saw the behaviour, and that it appears undocumented.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9 Craig Ringer
craig@2ndquadrant.com
In reply to: Michael Paquier (#3)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 29 March 2018 at 10:30, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check
the returned status, sometimes doing retries like in mdsync, so what is
proposed here would be a regression.

I covered this in my original post.

Yes, we check the return value. But what do we do about it? For fsyncs of
heap files, we ERROR, aborting the checkpoint. We'll retry the checkpoint
later, which will retry the fsync(). **Which will now appear to succeed**
because the kernel forgot that it lost our writes after telling us the
first time. So we do check the error code, which returns success, and we
complete the checkpoint and move on.

But we only retried the fsync, not the writes before the fsync.

So we lost data. Or rather, we failed to detect that the kernel had done
so, so our checkpoint was bad even though it appeared to complete.

The problem is that we keep retrying checkpoints *without* repeating the
writes leading up to the checkpoint, and retrying fsync.

I don't pretend the kernel behaviour is sane, but we'd better deal with it
anyway.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10 Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 28 March 2018 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as
well to avoid similar lost-page-write issues.

It's not necessary on ext3/ext4 with errors=remount-ro, but that's only
because the FS stops us dead in our tracks.

I don't pretend it's sane. The kernel behaviour is IMO crazy. If it's going
to lose a write, it should at minimum mark the FD as broken so no further
fsync() or anything else can succeed on the FD, and an app that cares about
durability must repeat the whole set of work since the prior successful
fsync(). Just reporting it once and forgetting it is madness.

But even if we convince the kernel folks of that, how do other platforms
behave? And how long before these kernels are out of use? We'd better deal
with it, crazy or no.

Please see my StackOverflow post for the kernel-level explanation. Note
also the test case link there. https://stackoverflow.com/a/42436054/398670

Retrying fsync() is not OK at least on Linux. When fsync() returns
success it means "all writes since the last fsync have hit disk" but we
assume it means "all writes since the last SUCCESSFUL fsync have hit
disk".

If that's actually the case, we need to push back on this kernel brain
damage, because as you're describing it fsync would be completely useless.

It's not useless, it's just telling us something other than what we think
it means. The promise it seems to give us is that if it reports an error
once, everything *after* that is useless, so we should throw our toys,
close and reopen everything, and redo from the last known-good state.

Though as Tomas posted below, it provides rather weaker guarantees than I
thought in some other areas too. See that lwn.net article he linked.

Moreover, POSIX is entirely clear that successful fsync means all
preceding writes for the file have been completed, full stop, doesn't
matter when they were issued.

I can't find anything that says so to me. Please quote relevant spec.

I'm working from
http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html which
states that

"The fsync() function shall request that all data for the open file
descriptor named by fildes is to be transferred to the storage device
associated with the file described by fildes. The nature of the transfer is
implementation-defined. The fsync() function shall not return until the
system has completed that action or until an error is detected."

My reading is that POSIX does not specify what happens AFTER an error is
detected. It doesn't say that error has to be persistent and that
subsequent calls must also report the error. It also says:

"If the fsync() function fails, outstanding I/O operations are not
guaranteed to have been completed."

but that doesn't clarify matters much either, because it can be read to
mean that once there's been an error reported for some IO operations
there's no guarantee those operations are ever completed even after a
subsequent fsync returns success.

I'm not seeking to defend what the kernel seems to be doing. Rather, saying
that we might see similar behaviour on other platforms, crazy or not. I
haven't looked past linux yet, though.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#11 Thomas Munro
thomas.munro@gmail.com
In reply to: Craig Ringer (#10)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 28 March 2018 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as
well to avoid similar lost-page-write issues.

I found your discussion with kernel hacker Jeff Layton at
https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
writeup seems to want a scheme where pages stay dirty after a
writeback failure so that we can try to fsync them again. Note that
that has never been the case in Linux after hard writeback failures,
AFAIK, so programs should definitely not assume that behavior."

The article above that says the same thing a couple of different ways,
ie that writeback failure leaves you with pages that are neither
written to disk successfully nor marked dirty.

If I'm reading various articles correctly, the situation was even
worse before his errseq_t stuff landed. That fixed cases of
completely unreported writeback failures due to sharing of PG_error
for both writeback and read errors with certain filesystems, but it
doesn't address the clean pages problem.

Yeah, I see why you want to PANIC.

Moreover, POSIX is entirely clear that successful fsync means all
preceding writes for the file have been completed, full stop, doesn't
matter when they were issued.

I can't find anything that says so to me. Please quote relevant spec.

I'm working from
http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html which
states that

"The fsync() function shall request that all data for the open file
descriptor named by fildes is to be transferred to the storage device
associated with the file described by fildes. The nature of the transfer is
implementation-defined. The fsync() function shall not return until the
system has completed that action or until an error is detected."

My reading is that POSIX does not specify what happens AFTER an error is
detected. It doesn't say that error has to be persistent and that subsequent
calls must also report the error. It also says:

FWIW my reading is the same as Tom's. It says "all data for the open
file descriptor" without qualification or special treatment after
errors. Not "some".

I'm not seeking to defend what the kernel seems to be doing. Rather, saying
that we might see similar behaviour on other platforms, crazy or not. I
haven't looked past linux yet, though.

I see no reason to think that any other operating system would behave
that way without strong evidence... This is openly acknowledged to be
"a mess" and "a surprise" in the Filesystem Summit article. I am not
really qualified to comment, but from a cursory glance at FreeBSD's
vfs_bio.c I think it's doing what you'd hope for... see the code near
the comment "Failed write, redirty."

--
Thomas Munro
http://www.enterprisedb.com

#12 Craig Ringer
craig@2ndquadrant.com
In reply to: Thomas Munro (#11)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 29 March 2018 at 20:07, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:

On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer <craig@2ndquadrant.com>
wrote:

On 28 March 2018 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC
as well to avoid similar lost-page-write issues.

I found your discussion with kernel hacker Jeff Layton at
https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
writeup seems to want a scheme where pages stay dirty after a
writeback failure so that we can try to fsync them again. Note that
that has never been the case in Linux after hard writeback failures,
AFAIK, so programs should definitely not assume that behavior."

The article above that says the same thing a couple of different ways,
ie that writeback failure leaves you with pages that are neither
written to disk successfully nor marked dirty.

If I'm reading various articles correctly, the situation was even
worse before his errseq_t stuff landed. That fixed cases of
completely unreported writeback failures due to sharing of PG_error
for both writeback and read errors with certain filesystems, but it
doesn't address the clean pages problem.

Yeah, I see why you want to PANIC.

In more ways than one ;)

I'm not seeking to defend what the kernel seems to be doing. Rather,
saying that we might see similar behaviour on other platforms, crazy or
not. I haven't looked past linux yet, though.

I see no reason to think that any other operating system would behave
that way without strong evidence... This is openly acknowledged to be
"a mess" and "a surprise" in the Filesystem Summit article. I am not
really qualified to comment, but from a cursory glance at FreeBSD's
vfs_bio.c I think it's doing what you'd hope for... see the code near
the comment "Failed write, redirty."

Ok, that's reassuring, but doesn't help us on the platform the great
majority of users deploy on :(

"If on Linux, PANIC"

Hrm.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13 Catalin Iacob
iacobcatalin@gmail.com
In reply to: Thomas Munro (#11)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Thu, Mar 29, 2018 at 2:07 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

I found your discussion with kernel hacker Jeff Layton at
https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
writeup seems to want a scheme where pages stay dirty after a
writeback failure so that we can try to fsync them again. Note that
that has never been the case in Linux after hard writeback failures,
AFAIK, so programs should definitely not assume that behavior."

And a bit below in the same comments, to this question about PG: "So,
what are the options at this point? The assumption was that we can
repeat the fsync (which as you point out is not the case), or shut
down the database and perform recovery from WAL", the same Jeff Layton
seems to agree PANIC is the appropriate response:
"Replaying the WAL synchronously sounds like the simplest approach
when you get an error on fsync. These are uncommon occurrences for the
most part, so having to fall back to slow, synchronous error recovery
modes when this occurs is probably what you want to do.".
And right after, he confirms the errseq_t patches are about always
detecting this, not more:
"The main thing I working on is to better guarantee is that you
actually get an error when this occurs rather than silently corrupting
your data. The circumstances where that can occur require some
corner-cases, but I think we need to make sure that it doesn't occur."

Jeff's comments in the pull request that merged errseq_t are worth
reading as well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750

The article above that says the same thing a couple of different ways,
ie that writeback failure leaves you with pages that are neither
written to disk successfully nor marked dirty.

If I'm reading various articles correctly, the situation was even
worse before his errseq_t stuff landed. That fixed cases of
completely unreported writeback failures due to sharing of PG_error
for both writeback and read errors with certain filesystems, but it
doesn't address the clean pages problem.

Indeed, that's exactly how I read it as well (opinion formed
independently before reading your sentence above). The errseq_t
patches landed in v4.13 by the way, so very recently.
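For the archives, a toy model of what errseq_t buys (my simplification;
the real implementation packs a "seen" bit and counter into one word):
each open file gets told about a writeback error exactly once, but the
pages are still not re-dirtied:

```c
#include <stdint.h>

/* Toy model of the errseq_t mechanism merged in Linux 4.13: the inode
 * keeps a sequence counter bumped on each writeback error, and every
 * struct file remembers the last value it has "seen", so each open
 * descriptor is told about an error exactly once. */
struct toy_inode { uint32_t wb_err; };
struct toy_file  { struct toy_inode *inode; uint32_t wb_err_seen; };

void toy_writeback_error(struct toy_inode *ino) { ino->wb_err++; }

/* Returns -1 (EIO) if an error happened since this fd last checked. */
int toy_file_check_err(struct toy_file *f)
{
    if (f->wb_err_seen != f->inode->wb_err) {
        f->wb_err_seen = f->inode->wb_err;
        return -1;
    }
    return 0;
}
```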

Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel
versions before v4.13, which at this point is pretty much everything
out there, not even detecting this reliably. This is messy.

#14 Thomas Munro
thomas.munro@gmail.com
In reply to: Catalin Iacob (#13)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Fri, Mar 30, 2018 at 5:20 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:

Jeff's comments in the pull request that merged errseq_t are worth
reading as well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750

Wow. It looks like there may be a separate question of when each
filesystem adopted this new infrastructure?

Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel
versions before v4.13, which at this point is pretty much everything
out there, not even detecting this reliably. This is messy.

The pre-errseq_t problems are beyond our control. There's nothing we
can do about that in userspace (except perhaps abandon OS-buffered IO,
a big project). We just need to be aware that this problem exists in
certain kernel versions and be grateful to Layton for fixing it.

The dropped dirty flag problem is something we can and in my view
should do something about, whatever we might think about that design
choice. As Andrew Gierth pointed out to me in an off-list chat about
this, by the time you've reached this state, both PostgreSQL's buffer
and the kernel's buffer are clean and might be reused for another
block at any time, so your data might be gone from the known universe
-- we don't even have the option to rewrite our buffers in general.
Recovery is the only option.

Thank you to Craig for chasing this down and +1 for his proposal, on Linux only.

--
Thomas Munro
http://www.enterprisedb.com

#15 Anthony Iliopoulos
ailiop@altatus.com
In reply to: Thomas Munro (#14)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:

Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel
versions before v4.13, which at this point is pretty much everything
out there, not even detecting this reliably. This is messy.

There may still be a way to reliably detect this on older kernel
versions from userspace, but it will be messy in any case. On EIO
errors, the kernel will not restore the dirty page flags, but it
will flip the error flags on the failed pages. One could mmap()
the file in question, obtain the PFNs (via /proc/pid/pagemap)
and enumerate those to match the ones with the error flag switched
on (via /proc/kpageflags). This could serve at least as a detection
mechanism, but one could also further use this info to logically
map the pages that failed IO back to the original file offsets,
and potentially retry IO just for those file ranges that cover
the failed pages. Just an idea, not tested.
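The decoding side of that idea would look roughly like this (untested
sketch; reading real PFNs from /proc/pid/pagemap needs CAP_SYS_ADMIN on
recent kernels, and the bit layout is from the kernel's pagemap
documentation):

```c
#include <stdint.h>
#include <stdbool.h>

/* Helpers for the detection scheme sketched above: decode
 * /proc/<pid>/pagemap entries for the mmap()ed file to get PFNs, then
 * look each PFN up in /proc/kpageflags and check its error bit.
 * pagemap entry: bit 63 = present, bits 0-54 = PFN when present.
 * kpageflags: bit 1 = KPF_ERROR. */

#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)
#define KPF_ERROR_BIT    1

bool pagemap_present(uint64_t entry)
{
    return (entry & PAGEMAP_PRESENT) != 0;
}

uint64_t pagemap_pfn(uint64_t entry)
{
    return pagemap_present(entry) ? (entry & PAGEMAP_PFN_MASK) : 0;
}

bool kpageflags_has_error(uint64_t flags)
{
    return (flags >> KPF_ERROR_BIT) & 1;
}
```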

Best regards,
Anthony

#16 Craig Ringer
craig@2ndquadrant.com
In reply to: Anthony Iliopoulos (#15)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On 31 March 2018 at 21:24, Anthony Iliopoulos <ailiop@altatus.com> wrote:

On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:

Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel
versions before v4.13, which at this point is pretty much everything
out there, not even detecting this reliably. This is messy.

There may still be a way to reliably detect this on older kernel
versions from userspace, but it will be messy in any case. On EIO
errors, the kernel will not restore the dirty page flags, but it
will flip the error flags on the failed pages. One could mmap()
the file in question, obtain the PFNs (via /proc/pid/pagemap)
and enumerate those to match the ones with the error flag switched
on (via /proc/kpageflags). This could serve at least as a detection
mechanism, but one could also further use this info to logically
map the pages that failed IO back to the original file offsets,
and potentially retry IO just for those file ranges that cover
the failed pages. Just an idea, not tested.

That sounds like a huge amount of complexity, with uncertainty as to how
it'll behave kernel-to-kernel, for negligible benefit.

I was exploring the idea of doing selective recovery of one relfilenode,
based on the assumption that we know the filenode related to the fd that
failed to fsync(). We could redo only the WAL for that relation. But it
fails the same test: it's too complex for a niche case that shouldn't
happen in the first place, so it'll probably have bugs, or grow bugs
through bitrot over time.

Remember, if you're on ext4 with errors=remount-ro, you get shut down even
harder than a PANIC. So we should just use the big hammer here.
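A minimal sketch of the big hammer, assuming a hypothetical helper (in
the real backend this would be an ereport(PANIC, ...); abort() stands in
for it here so the example is self-contained):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Treat any fsync() failure as fatal. Retrying is unsafe: on Linux the
 * failed fsync() clears the error state, so a retry can report success
 * even though the dirty pages were dropped and never reached disk. */
static void
fsync_or_panic(int fd, const char *path)
{
    if (fsync(fd) != 0)
    {
        fprintf(stderr, "PANIC: could not fsync file \"%s\": %s\n",
                path, strerror(errno));
        abort();    /* crash recovery replays from the last good checkpoint */
    }
}
```

After the PANIC, crash recovery replays WAL from the previous redo point,
so the data is rewritten from WAL rather than trusted to a second fsync().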

I'll send a patch this week.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#16)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Craig Ringer <craig@2ndquadrant.com> writes:

So we should just use the big hammer here.

And bitch, loudly and publicly, about how broken this kernel behavior is.
If we make enough of a stink maybe it'll get fixed.

regards, tom lane

#18Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#17)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

So we should just use the big hammer here.

And bitch, loudly and publicly, about how broken this kernel behavior is.
If we make enough of a stink maybe it'll get fixed.

That won't fix anything already released, so given the information
gathered something has to be done anyway. The discussion in this
thread is actually spreading quite widely.

Handling things at a low level looks like a better plan for the
backend. Tools like pg_basebackup and pg_dump also issue fsyncs on
the data they create, so we should do the equivalent for them, with
some exit() calls in file_utils.c. As of now failures are logged to
stderr but not considered fatal.
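For the frontend tools, the change described might look roughly like
this (a standalone illustration with a made-up function name; the real
helpers live in src/common/file_utils.c):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sketch: make an fsync failure in a frontend tool fatal rather than
 * a warning, so a failed flush cannot be mistaken for a durable
 * backup or dump. */
static void
fsync_fname_or_die(const char *fname)
{
    int fd = open(fname, O_RDONLY);

    if (fd < 0)
    {
        fprintf(stderr, "could not open file \"%s\": %s\n",
                fname, strerror(errno));
        exit(EXIT_FAILURE);
    }
    if (fsync(fd) != 0)
    {
        /* Previously just logged to stderr; now fatal. */
        fprintf(stderr, "could not fsync file \"%s\": %s\n",
                fname, strerror(errno));
        exit(EXIT_FAILURE);
    }
    close(fd);
}
```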
--
Michael

#19Anthony Iliopoulos
ailiop@altatus.com
In reply to: Craig Ringer (#16)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Sun, Apr 01, 2018 at 12:13:09AM +0800, Craig Ringer wrote:

On 31 March 2018 at 21:24, Anthony Iliopoulos <[1]ailiop@altatus.com>
wrote:

On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:

Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel
versions before v4.13, which at this point are pretty much everything
out there and do not even detect this reliably. This is messy.

There may still be a way to reliably detect this on older kernel
versions from userspace, but it will be messy regardless. On EIO
errors, the kernel will not restore the dirty page flags, but it
will flip the error flags on the failed pages. One could mmap()
the file in question, obtain the PFNs (via /proc/pid/pagemap)
and enumerate those to match the ones with the error flag switched
on (via /proc/kpageflags). This could serve at least as a detection
mechanism, but one could also further use this info to logically
map the pages that failed IO back to the original file offsets,
and potentially retry IO just for those file ranges that cover
the failed pages. Just an idea, not tested.

That sounds like a huge amount of complexity, with uncertainty as to how
it'll behave kernel-to-kernel, for negligible benefit.

Those interfaces have been around since the kernel 2.6 days and are
rather stable, but I was merely responding to the comment in your
original post about having a way of finding out which page(s) failed.
I assume that indeed there would be no benefit, especially since those
errors are usually not transient (typically they come from hard medium
faults), and although a filesystem could in theory mask the error by
allocating a different logical block, I am not aware of any
implementation that currently does so.

I was exploring the idea of doing selective recovery of one relfilenode,
based on the assumption that we know the filenode related to the fd that
failed to fsync(). We could redo only the WAL for that relation. But it
fails the same test: it's too complex for a niche case that shouldn't
happen in the first place, so it'll probably have bugs, or grow bugs
through bitrot over time.

Fully agree, those cases should be sufficiently rare that a complex
and possibly non-maintainable solution is not really warranted.

Remember, if you're on ext4 with errors=remount-ro, you get shut down even
harder than a PANIC. So we should just use the big hammer here.

I am not entirely sure what you mean here; does Pg really treat write()
errors as fatal? Also, the kinds of errors that ext4 detects with this
option are at the superblock level and govern metadata rather than actual
data writes (recall that those are buffered anyway; no actual device IO
has to take place at the time of write()).
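For reference, the ext4 behaviour under discussion is selected at mount
time; a typical (illustrative) fstab entry looks like:

```
# illustrative /etc/fstab entry: ext4 remounts read-only on errors
/dev/sdb1  /var/lib/postgresql  ext4  defaults,errors=remount-ro  0  2
```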

Best regards,
Anthony

#20Anthony Iliopoulos
ailiop@altatus.com
In reply to: Tom Lane (#17)
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

So we should just use the big hammer here.

And bitch, loudly and publicly, about how broken this kernel behavior is.
If we make enough of a stink maybe it'll get fixed.

It is not likely to be fixed (beyond what has already been done with the
manpage patches and the errseq_t fixes at the reporting level). The issue
is that the kernel needs to deal with hard IO errors at that level
somehow, and since those errors typically persist, re-dirtying the pages
would not really solve the problem (unless some filesystem remaps the
request to a different block, assuming the device is alive). Keeping
around dirty pages that can never be written out is essentially a memory
leak, as those pages would stay around even after the application has
exited.

Best regards,
Anthony
