Standbys which don't synch to disk?
Fujii, Simon,
For 9.1, both master and replica in a sync replication relationship are
required to be fsync'ing to disk. I understand why we had to do that
for our first cut at synch rep. Do you think, though, that it might
become possible to replicate without synch-to-disk for 9.2?
The use case I have is cloud hosting, where I'd rather have two or three
synchronous standbys than synch to disk.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Wed, May 11, 2011 at 1:12 PM, Josh Berkus <josh@agliodbs.com> wrote:
For 9.1, both master and replica in a sync replication relationship are
required to be fsync'ing to disk. I understand why we had to do that
for our first cut at synch rep. Do you think, though, that it might
become possible to replicate without synch-to-disk for 9.2?
The use case I have is cloud hosting, where I'd rather have two or three
synchronous standbys than synch to disk.
It's already possible to set fsync=off on the standby if you want. If
there is an OS-level crash you'll need to rebuild the standby, but in
some cases that may be acceptable.
And Simon has already written a patch to add a "receive" mode to sync
rep, which I expect will get committed to 9.2. In that mode, the
standby can acknowledge the WAL records as soon as they are received,
and write them to disk just after. I think we do need some
benchmarking there, to figure out whether any changes to the timing of
replies are needed in that case. But the basic principle seems sound.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
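
To make the difference between the two acknowledgement points concrete, here is a toy sketch in Python. It is an illustration only, not PostgreSQL code and not Simon's patch: the class `ToyStandby` and the mode names "fsync" and "receive" are hypothetical, and the model only tracks byte counts. It shows that a standby which acknowledges on receipt can lose WAL it has already acknowledged if the OS crashes before the write reaches disk.

```python
from dataclasses import dataclass


@dataclass
class ToyStandby:
    """Toy model of a standby; not PostgreSQL's actual WAL receiver."""
    ack_mode: str        # "fsync" or "receive" (hypothetical names)
    received: int = 0    # bytes of WAL held in memory
    flushed: int = 0     # bytes of WAL durably on disk
    acked: int = 0       # bytes acknowledged to the master

    def receive(self, nbytes: int) -> None:
        self.received += nbytes
        if self.ack_mode == "receive":
            # Acknowledge as soon as the WAL arrives, before touching disk.
            self.acked = self.received

    def flush(self) -> None:
        self.flushed = self.received
        if self.ack_mode == "fsync":
            # Acknowledge only once the WAL is durable.
            self.acked = self.flushed

    def crash(self) -> int:
        """OS-level crash: unflushed WAL is lost. Returns acked-but-lost bytes."""
        lost = max(0, self.acked - self.flushed)
        self.received = self.flushed
        return lost


for mode in ("fsync", "receive"):
    standby = ToyStandby(mode)
    standby.receive(1024)
    print(mode, "acked before flush:", standby.acked,
          "acked but lost on crash:", standby.crash())
```

In this model the "fsync" standby has acknowledged nothing at the moment of the crash, while the "receive" standby has acknowledged 1024 bytes that it no longer holds. That is the trade-off the benchmarking would have to weigh against the latency saved.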
It's already possible to set fsync=off on the standby if you want. If
there is an OS-level crash you'll need to rebuild the standby, but in
some cases that may be acceptable.
Yes, generally if there's an OS-level crash on cloud hosting, you've
lost the instance anyway.
And Simon has already written a patch to add a "receive" mode to sync
rep, which I expect will get committed to 9.2. In that mode, the
standby can acknowledge the WAL records as soon as they are received,
and write them to disk just after. I think we do need some
benchmarking there, to figure out whether any changes to the timing of
replies are needed in that case. But the basic principle seems sound.
Yes, that's what I'm looking for. The one other thing would be the
ability not to fsync the master, which would come out of the whole
"stream from buffers" patch which Fujii was working on. Fujii, is that
still something you're working on?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
It's already possible to set fsync=off on the standby if you want. If
there is an OS-level crash you'll need to rebuild the standby, but in
some cases that may be acceptable.
... The one other thing would be the
ability not to fsync the master, which would come out of the whole
"stream from buffers" patch which Fujii was working on. Fujii, is that
still something you're working on?
Huh? Surely you can just set fsync=off on the master if you feel like
it. Data integrity not guaranteed, of course, but if you don't care...
regards, tom lane
Robert,
That WAL has effectively disappeared from the
master, but is still present on the slave. Now the master comes up
and starts processing read-write transactions again, and generates a
new and different 1kB of WAL. Hilarity ensues, because the two
machines are now out of step with each other.
Yeah, you'd need some kind of instant failover and STONITH. That is,
any interruption on the master would be a failover situation. While
that seems conceivable for crashes, consider that a planned restart of
the master might be an issue, and an OOM-kill would certainly be.
You could possibly fix this by making provision for the master to
connect to the slave on start-up and stream WAL "backwards" from slave
to master. That'd be pretty spiffy.
Ouch, now you're making my head hurt.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
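
The divergence scenario quoted in the message above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL code: WAL is represented as a list of record labels, and the helpers `crash_and_restart` and `first_divergence` are hypothetical names invented for the illustration. The master loses its unsynced tail in a crash, restarts, and writes a different record at the position the standby has already filled.

```python
def crash_and_restart(master_wal, durable_len):
    """Return the WAL the master still has after a crash: only the durable prefix."""
    return master_wal[:durable_len]


def first_divergence(a, b):
    """Index of the first record where two WAL streams differ.

    Returns None if one stream is a prefix of the other.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


master = ["rec1", "rec2", "rec3"]
standby = list(master)  # the standby has streamed everything, including rec3

# rec3 never reached the master's disk, so it vanishes in the crash.
master = crash_and_restart(master, durable_len=2)

# The restarted master accepts writes and generates new, different WAL
# at the position the standby already holds rec3.
master.append("rec3-prime")

print("streams diverge at record index:", first_divergence(master, standby))
```

Once the two streams differ at the same position, the standby cannot simply keep applying WAL from the master. That is why the thread turns to failover with STONITH, or to streaming WAL back from the standby on start-up.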
On Thu, May 12, 2011 at 3:48 AM, Josh Berkus <josh@agliodbs.com> wrote:
Robert,
That WAL has effectively disappeared from the
master, but is still present on the slave. Now the master comes up
and starts processing read-write transactions again, and generates a
new and different 1kB of WAL. Hilarity ensues, because the two
machines are now out of step with each other.
Yeah, you'd need some kind of instant failover and STONITH. That is,
any interruption on the master would be a failover situation. While
that seems conceivable for crashes, consider that a planned restart of
the master might be an issue, and an OOM-kill would certainly be.
You could possibly fix this by making provision for the master to
connect to the slave on start-up and stream WAL "backwards" from slave
to master. That'd be pretty spiffy.
Ouch, now you're making my head hurt.
I believe many people who use SR with clusterware would fail over
instead of restarting the master when it crashes. So I don't think it's
a bad idea to let them use the stream-WAL-from-buffers feature at
their own risk. It's the same as letting them specify fsync=off
or full_page_writes=off.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center