Transaction-controlled robustness for replication

Started by Simon Riggs over 17 years ago · 41 messages · hackers
#1Simon Riggs
simon@2ndQuadrant.com

One of the cool features of 8.3 was the ability to control at the
transaction level whether we would use synchronous or asynchronous
commit.

We're planning to add integrated replication features to 8.4, and I
think it will be possible to extend the concept of asynchronous commit
to a more general form of transaction-level robustness. Note that the
proof that it's possible to mix asynchronous and synchronous transactions
on the same system has already been established, so this is just a
straight extension of that concept.

Asynchronous commit controls whether we go to disk at time of commit, or
whether we defer this slightly. We have the same options with
replication: do we replicate at time of commit, or do we defer this
slightly for performance reasons. DRBD and other replication systems
show us that there is actually another difference when talking about
synchronous replication: do we go to disk on the standby before
acknowledging the primary?

We can generalise this as three closed questions, each answered either Yes
(Synchronous) or No (Asynchronous):

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?

In code, these are simple if tests: Do we wait, or not?
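As a sketch of what those "simple if tests" amount to (hypothetical names, not actual PostgreSQL internals), the three questions reduce to three independent waits at commit time:

```python
from dataclasses import dataclass

# Hypothetical sketch of the three commit-time decisions described above;
# the names are illustrative and do not correspond to real PostgreSQL code.
@dataclass
class RobustnessSettings:
    sync_commit: bool            # force WAL to disk on primary at commit?
    sync_standby_transfer: bool  # force WAL across the link to the standby?
    sync_standby_fsync: bool     # force WAL to disk on the standby?

def waits_at_commit(s: RobustnessSettings) -> list[str]:
    """Return the waits performed at commit time, in order."""
    waits = []
    if s.sync_commit:
        waits.append("local WAL fsync")
    if s.sync_standby_transfer:
        waits.append("WAL transfer to standby")
    if s.sync_standby_fsync:
        waits.append("WAL fsync on standby")
    return waits
```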

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

If we are able to define these robustness characteristics for each
transaction *separately* then it will represent an industry first: no
other database has that capability implemented or planned on published
roadmaps, nor has it been discussed in research to my knowledge.

Changing the parameter setting at transaction-level would be expensive
if we had to set three parameters.

Or we could use just two parameters:
synchronous_commit = on | off
synchronous_replication = 'AA', 'SA' or 'SS'
with A = Asynchronous, S = Synchronous
which corresponds with DRBD's algorithms like this
DRBD A = AA
DRBD B = SA
DRBD C = SS

Or we could use just a single parameter
synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on | off when no
log-based replication is defined

Having the ability to set these at the transaction-level would be very
cool. Having it set via a *single* parameter would make it much more
viable to switch between AAA for bulk, low importance data and SSS for
very low volume, critical data, or somewhere in between on the same
server, at the same time.
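The single-parameter idea can be sketched as decoding a three-letter mode string into the three per-step settings (illustrative only; `decode_mode` and the dictionary keys are hypothetical):

```python
# Illustrative decoder for the proposed single-parameter form: one letter per
# question, S = Synchronous, A = Asynchronous. Only the four combinations
# listed in the proposal are accepted.
VALID_MODES = {"AAA", "SAA", "SSA", "SSS"}

def decode_mode(mode: str) -> dict[str, bool]:
    if mode not in VALID_MODES:
        raise ValueError(f"invalid robustness mode: {mode!r}")
    commit, transfer, fsync = (c == "S" for c in mode)
    return {
        "synchronous_commit": commit,
        "synchronous_standby_transfer": transfer,
        "synchronous_standby_wal_fsync": fsync,
    }
```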

So proposal in summary is
* allow various modes of synchronous replication for perf/robustness
* allow modes to be specified per-transaction
* allow modes to be specified as a single parameter

I think Itagaki may have described similar concepts at PGCon2008, but
this thread has been started to make sure that meme definitely has been
released into the wild, and to discuss how we might specify it?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#2Markus Wanner
markus@bluegap.ch
In reply to: Simon Riggs (#1)
Re: Transaction-controlled robustness for replication

Hi,

very nice proposal and thoughts. Allow me to compare against Postgres-R
again.

Simon Riggs wrote:

Asynchronous commit controls whether we go to disk at time of commit, or
whether we defer this slightly. We have the same options with
replication: do we replicate at time of commit, or do we defer this
slightly for performance reasons. DRBD and other replication systems
show us that there is actually another difference when talking about
synchronous replication: do we go to disk on the standby before
acknowledging the primary?

Yeah, I was thinking in the same direction for Postgres-R. There
already exist three replication levels: sync, eager and lazy.

With more than just a primary and a standby server in mind, one can
also ask how many remote nodes need to have written the changes to disc
before commit is confirmed in 'sync' mode. At least a majority is
required; possibly more nodes are wanted.

The eager mode is what the original Postgres-R approach is all about and
is pretty much the only one I've implemented, at the moment. It only
requires confirmation from the GCS, which means at least a majority of
the nodes have received the change set (and will be able to apply it).
(This leads to a corner case for a full cluster outage, see [1].)

In async mode, commit is confirmed before sending the change set to
other nodes.

If we are able to define these robustness characteristics for each
transaction *separately* then it will represent an industry first:

Yeah, that would be pretty cool.

no
other database has that capability implemented or planned on published
roadmaps, nor has it been discussed in research to my knowledge.

Well, a partial implementation in Postgres-R, if that counts... ;-)

Regards

Markus

[1]: One of the few threads on the Postgres-R-general mailing list: http://pgfoundry.org/pipermail/postgres-r-general/2006-August/000002.html

#3Marko Kreen
markokr@gmail.com
In reply to: Simon Riggs (#1)
Re: Transaction-controlled robustness for replication

On 7/22/08, Simon Riggs <simon@2ndquadrant.com> wrote:

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

If we are able to define these robustness characteristics for each
transaction *separately* then it will represent an industry first: no
other database has that capability implemented or planned on published
roadmaps, nor has it been discussed in research to my knowledge.

Changing the parameter setting at transaction-level would be expensive
if we had to set three parameters.

How about extending BEGIN with additional keywords?

--
marko

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Marko Kreen (#3)
Re: Transaction-controlled robustness for replication

On Wed, 2008-07-23 at 01:39 +0300, Marko Kreen wrote:

On 7/22/08, Simon Riggs <simon@2ndquadrant.com> wrote:

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

If we are able to define these robustness characteristics for each
transaction *separately* then it will represent an industry first: no
other database has that capability implemented or planned on published
roadmaps, nor has it been discussed in research to my knowledge.

Changing the parameter setting at transaction-level would be expensive
if we had to set three parameters.

How about extending BEGIN with additional keywords?

SQL Standard, plus many interfaces hide BEGIN from you.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#5Jens-Wolfhard Schicke
In reply to: Simon Riggs (#1)
Re: Transaction-controlled robustness for replication


Simon Riggs wrote:

Asynchronous commit controls whether we go to disk at time of commit, or
whether we defer this slightly. We have the same options with
replication: do we replicate at time of commit, or do we defer this
slightly for performance reasons. DRBD and other replication systems
show us that there is actually another difference when talking about
synchronous replication: do we go to disk on the standby before
acknowledging the primary?

We can generalise this as three closed questions, answered either Yes
(Synchronous) or No (Asynchronous)

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?

* Does WAL get applied [and synced] to disk on standby at commit time?
This is important if you want to use the standby as a read-only server.
I am slightly confused about what the fsync setting does to all this, hence
the brackets.

I think that questions 2 and 3 are trivially bundled together. Once the
user can specify 2, implementing 3 should be trivial and vice versa.
I am not even convinced that these need to be two different parameters.
Also please note that an answer of "yes" to 3 means that 2 must also
be answered "yes".

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

synchronous_standby_apply = on | off # just to propose a name

Changing the parameter setting at transaction-level would be expensive
if we had to set three parameters.

What exactly does "expensive" mean? All three parameters can probably be set
in one TCP packet from client to server.

Or we could use just a single parameter
synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on | off when no
log-based replication is defined

Having the ability to set these at the transaction-level would be very
cool. Having it set via a *single* parameter would make it much more
viable to switch between AAA for bulk, low importance data and SSS for
very low volume, critical data, or somewhere in between on the same
server, at the same time.

The problem with a single parameter is that everything becomes position
dependent, and if a new parameter is ever introduced, it's not easy to
upgrade old application code.

So proposal in summary is
* allow various modes of synchronous replication for perf/robustness
* allow modes to be specified per-transaction
* allow modes to be specified as a single parameter

How about creating named modes? This would give the user the ability to
define more fine-grained control especially in larger clusters of fail-over/read-only
servers without totally clogging the parameter space and application code.
Whether this should be done SQL-style or in some config file is not so clear to me,
although I'd prefer SQL-style like

CREATE SYNCHRONIZING MODE immediate_readonly AS
LOCAL SYNCHRONOUS APPLY
192.168.0.10 SYNCHRONOUS APPLY -- read-only slave
192.168.0.11 SYNCHRONOUS APPLY -- read-only slave
192.168.0.20 SYNCHRONOUS SHIP -- backup-server
192.168.0.21 SYNCHRONOUS SHIP -- backup-server
192.168.0.30 SYNCHRONOUS FSYNC -- backup-server with fast disks
;

and then something like

synchronize_mode = immediate_readonly;

Yeah, I know, give patches not pipe-dreams :)

Regards,
Jens-Wolfhard Schicke

#6Markus Wanner
markus@bluegap.ch
In reply to: Jens-Wolfhard Schicke (#5)
Re: Transaction-controlled robustness for replication

Hi,

Jens-Wolfhard Schicke wrote:

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?

* Does WAL get applied [and synced] to disk on standby at commit time?

I think that's what Simon means by his question no. 3. It wouldn't make
much sense to me otherwise.

I'm assuming the standby node has its own physical format, so the
changes from the remote WAL need to be transformed to a local WAL, which
then needs to be written to disc. For Postgres, this pretty much means
applying the changes and committing them. You never need to store the
remote WAL on physical storage; what would that be good for?

I think that questions 2 and 3 are trivially bundled together. Once the
user can specify 2, implementing 3 should be trivial and vice versa.

That might well be, yes. The code to collect changes from a transaction
and then apply them remotely is pretty much the same, no matter when it
is being executed. But it certainly makes a difference in the balance
between performance and availability, which is a decision the user
should be able to make for his specific application (or even better, per
transaction, as proposed here and in Postgres-R).

I am not even convinced that these need to be two different parameters.

Consider a standby heavily loaded (i/o) with some OLAP queries. Why
should the master wait until the standby has written anything to disk
for him?

Also please note that an answer of "yes" to 3 means that 2 must also
be answered "yes".

Agreed. There's no 'AS' mode possible, only 'SS', 'SA' and 'AA'.
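That constraint works out mechanically: a synchronous step can never follow an asynchronous one, so 'S' must not appear after 'A' in the mode string. A tiny sketch of that check (hypothetical helper; it works for both the two-letter and three-letter forms discussed in this thread):

```python
def is_consistent(mode: str) -> bool:
    """True if no synchronous (S) step follows an asynchronous (A) one,
    i.e. the mode is a run of S's followed by a run of A's."""
    assert set(mode) <= {"S", "A"}
    return mode == "S" * mode.count("S") + "A" * mode.count("A")
```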

How about creating named modes? This would give the user the ability to
define more fine-grained control especially in larger clusters of fail-over/read-only
servers without totally clogging the parameter space and application code.
Whether this should be done SQL-style or in some config file is not so clear to me,
although I'd prefer SQL-style like

CREATE SYNCHRONIZING MODE immediate_readonly AS
LOCAL SYNCHRONOUS APPLY
192.168.0.10 SYNCHRONOUS APPLY -- read-only slave
192.168.0.11 SYNCHRONOUS APPLY -- read-only slave
192.168.0.20 SYNCHRONOUS SHIP -- backup-server
192.168.0.21 SYNCHRONOUS SHIP -- backup-server
192.168.0.30 SYNCHRONOUS FSYNC -- backup-server with fast disks
;

Hm.. that's an interesting idea. Especially considering the number of
options that arise with more than two or three nodes, where you maybe
also want to specify how many nodes must have written the changes to
disk before confirming the commit.

In Postgres-R, I've added a TRANSACTION REPLICATION LEVEL, which can be
either SYNC, EAGER or LAZY. Maybe that's not quite sufficient. On the
other hand, I don't think any other option here makes any sense. (Above,
you yourself doubt that sync is different enough from eager).

Regards

Markus

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Jens-Wolfhard Schicke (#5)
Re: Transaction-controlled robustness for replication

On Wed, 2008-07-23 at 10:49 +1000, Jens-Wolfhard Schicke wrote:


Simon Riggs wrote:

Asynchronous commit controls whether we go to disk at time of commit, or
whether we defer this slightly. We have the same options with
replication: do we replicate at time of commit, or do we defer this
slightly for performance reasons. DRBD and other replication systems
show us that there is actually another difference when talking about
synchronous replication: do we go to disk on the standby before
acknowledging the primary?

We can generalise this as three closed questions, answered either Yes
(Synchronous) or No (Asynchronous)

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?

* Does WAL get applied [and synced] to disk on standby at commit time?

This is important if you want to use the standby as a read-only server.

That's an assumption - I'm not sure it's a requirement in all cases.

If a standby query needed to see particular data then the *query* would
wait until correct data has been applied. I certainly wouldn't want to
penalise writing transactions on the primary because there *might* be a
statement on the standby that wishes to see an updated view.

I am slightly confused about what the fsync setting does to all this, hence
the brackets.

There is no sync() during WAL apply when each individual transaction
hits commit. This is because there is "no WAL", i.e. changes come from the
WAL to the database, so we have no need of a second WAL to protect the
changes being made.

I think that questions 2 and 3 are trivially bundled together. Once the
user can specify 2, implementing 3 should be trivial and vice versa.
I am not even convinced that these need to be two different parameters.
Also please note that an answer of "yes" to 3 means that 2 must also
be answered "yes".

Yes, they are trivially bundled together, but there is benefit in keeping
them separate. The difference between 2 and 3 is about performance and
levels of robustness.

Waiting for transfer across the link to the standby (only) is much faster
than waiting for transfer *and* waiting for fsync. Probably twice as fast
in a tightly coupled cluster, i.e. option 3 will make your transactions
somewhat more robust, but with twice the response time and half the
throughput.

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

synchronous_standby_apply = on | off # just to propose a name

Changing the parameter setting at transaction-level would be expensive
if we had to set three parameters.

What exactly does "expensive" mean? All three parameters can probably be set
in one TCP packet from client to server.

Expensive as in we need to parse and handle each statement separately.
If we have a single parameter then the overhead is much lower.

Or we could use just a single parameter
synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on | off when no
log-based replication is defined

Having the ability to set these at the transaction-level would be very
cool. Having it set via a *single* parameter would make it much more
viable to switch between AAA for bulk, low importance data and SSS for
very low volume, critical data, or somewhere in between on the same
server, at the same time.

The problem with a single parameter is that everything becomes position
dependent, and if a new parameter is ever introduced, it's not easy to
upgrade old application code.

True, but what new parameter do you imagine?

So proposal in summary is
* allow various modes of synchronous replication for perf/robustness
* allow modes to be specified per-transaction
* allow modes to be specified as a single parameter

How about creating named modes?

Good idea

This would give the user the ability to
define more fine-grained control especially in larger clusters of fail-over/read-only
servers without totally clogging the parameter space and application code.
Whether this should be done SQL-style or in some config file is not so clear to me,
although I'd prefer SQL-style like

CREATE SYNCHRONIZING MODE immediate_readonly AS
LOCAL SYNCHRONOUS APPLY
192.168.0.10 SYNCHRONOUS APPLY -- read-only slave
192.168.0.11 SYNCHRONOUS APPLY -- read-only slave
192.168.0.20 SYNCHRONOUS SHIP -- backup-server
192.168.0.21 SYNCHRONOUS SHIP -- backup-server
192.168.0.30 SYNCHRONOUS FSYNC -- backup-server with fast disks
;

That's not how we define parameter values, so no.

and then something like

synchronize_mode = immediate_readonly;

Yeah, I know, give patches not pipe-dreams :)

Ah yes. Of course.

The only sensible options are these four:

AAA
SAA
SSA
SSS

plus the existing on & off

So we give them 4 names and set it using a single parameter value.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#8Markus Wanner
markus@bluegap.ch
In reply to: Simon Riggs (#7)
Re: Transaction-controlled robustness for replication

Hi,

Simon Riggs wrote:

There is no sync() during WAL apply when each individual transaction
hits commit. This is because there is "no WAL", i.e. changes come from the
WAL to the database, so we have no need of a second WAL to protect the
changes being made.

Aha, that makes much more sense to me now. So you are not just using the
WAL to transfer changes, but the remote WAL replaces the one on the
standby completely.

Do you have replication to multiple slaves on the radar?

What I still don't understand is, why you are speaking about "logical"
replication. It rather sounds like an ordinary log shipping approach,
where the complete WAL is sent over the wire. Nothing wrong with that,
it certainly fits many needs and I welcome such a solution for Postgres.

In another thread, you are saying that application of the WAL on the
standby is "logical". This sounds like one could choose to skip certain
parts of the WAL on the standby, but still the complete WAL would have
to be sent. So this would only save a bit of i/o on the standby, right?

Expensive as in we need to parse and handle each statement separately.
If we have a single parameter then much lower overhead.

Is that really much of a concern when otherwise caring about network and
i/o latency? And what about sane default settings per session and
database, so you won't even need to explicitly set them for the majority
of transactions?

Regards

Markus

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Markus Wanner (#8)
Re: Transaction-controlled robustness for replication

On Sat, 2008-07-26 at 10:17 +0200, Markus Wanner wrote:

Expensive as in we need to parse and handle each statement separately.
If we have a single parameter then much lower overhead.

Is that really much of a concern when otherwise caring about network
and i/o latency?

I believe so. Jens-Wolfhard has provided the solution it seems.

And what about sane default settings per session and
database, so you won't even need to explicitly set them for the
majority of transactions?

Session pools.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Markus Wanner (#8)
Re: Transaction-controlled robustness for replication

On Sat, 2008-07-26 at 10:17 +0200, Markus Wanner wrote:

What I still don't understand is, why you are speaking about "logical"
replication. It rather sounds like an ordinary log shipping approach,
where the complete WAL is sent over the wire. Nothing wrong with that,
it certainly fits many needs and I welcome such a solution for Postgres.

Yes, first stage is log shipping. Second stage is either physical or
logical apply.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#11Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#1)
Re: Transaction-controlled robustness for replication

Simon Riggs wrote:

We can generalise this as three closed questions, answered either Yes
(Synchronous) or No (Asynchronous)

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?

In code, these are simple if tests: Do we wait, or not?

We could represent this with 3 parameters:
synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

...

Or we could use just two parameters:
synchronous_commit = on | off
synchronous_replication = 'AA', 'SA' or 'SS'
with A = Asynchronous, S = Synchronous
which corresponds with DRBD's algorithms like this
DRBD A = AA
DRBD B = SA
DRBD C = SS

Or we could use just a single parameter
synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on | off when no
log-based replication is defined

First, I love this kind of control.

Second, I think our current controls for the main/master are sufficient,
so we don't need to add additional variables for this; we can focus
just on whether we wait for the slave.

Third, we can control whether the slave WAL fsync's by setting things on
the slave, so I don't think we need a master setting for that.

Fourth, I don't think the boolean idea will work because there are
boolean combinations that are illogical, like having
synchronous_standby_transfer off and synchronous_standby_wal_fsync on.

I think you need to make it an enumerated type like log_min_messages;
something like:

wal_transfer_wait

with values of:

nothing: have network traffic send WAL as needed
netflush: wait for flush of WAL network packets to slave
process: wait for slave to process WAL traffic and
optionally fsync

The 'process' option either waits for fsync on the slave or not
depending on how the slave is configured, meaning you could use
synchronous_commit off to not wait for the fsync to disk.
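Bruce's enumerated setting could be sketched like this (hypothetical names mirroring the values above; not actual PostgreSQL code):

```python
from enum import Enum

# Hypothetical enumeration mirroring the proposed wal_transfer_wait values.
class WalTransferWait(Enum):
    NOTHING = "nothing"    # send WAL over the network as needed; don't wait
    NETFLUSH = "netflush"  # wait for WAL network packets flushed to the slave
    PROCESS = "process"    # wait for the slave to process the WAL; whether it
                           # also fsyncs depends on the slave's own settings

def master_blocks_at_commit(setting: WalTransferWait) -> bool:
    """Does the master wait for any slave acknowledgement at commit?"""
    return setting is not WalTransferWait.NOTHING
```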

I think we can add a table in the documentation to show how to set things
up in postgresql.conf for async master and async slave, and other
combinations.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#12Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#10)
Re: Transaction-controlled robustness for replication

Simon Riggs wrote:

On Sat, 2008-07-26 at 10:17 +0200, Markus Wanner wrote:

What I still don't understand is, why you are speaking about "logical"
replication. It rather sounds like an ordinary log shipping approach,
where the complete WAL is sent over the wire. Nothing wrong with that,
it certainly fits many needs and I welcome such a solution for Postgres.

Yes, first stage is log shipping. Second stage is either physical or
logical apply.

What is the attraction of logical application of the WAL logs?
Transmitting to a server with different architecture?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#12)
Re: Transaction-controlled robustness for replication

On Tue, 2008-08-12 at 11:52 -0400, Bruce Momjian wrote:

Simon Riggs wrote:

On Sat, 2008-07-26 at 10:17 +0200, Markus Wanner wrote:

What I still don't understand is, why you are speaking about "logical"
replication. It rather sounds like an ordinary log shipping approach,
where the complete WAL is sent over the wire. Nothing wrong with that,
it certainly fits many needs and I welcome such a solution for Postgres.

Yes, first stage is log shipping. Second stage is either physical or
logical apply.

What is the attraction of logical application of the WAL logs?
Transmitting to a server with different architecture?

Yes,

* different release
* different encoding
* different CPU architecture
* (with the correct transform) a different DBMS

So logical apply can provide a route for data transfer between
applications, not just replication for DR or HA.

Physical apply works and will be more performant, but it will always be
more restrictive. So there are arguments for doing it both ways.

I believe that we can and should offer both options to provide customer
choice. Ideally, it would be nice to be able to switch between the two
without significant reconfiguration, but that's definitely not for this
release.

(Pragmatically, implementation will be limited by my funding.)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#14Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#13)
Re: Transaction-controlled robustness for replication

Simon Riggs wrote:

What is the attraction of logical application of the WAL logs?
Transmitting to a server with different architecture?

Yes,

* different release
* different encoding
* different CPU architecture
* (with the correct transform) a different DBMS

So logical apply can provide a route for data transfer between
applications, not just replication for DR or HA.

Physical apply works and will be more performant, but it will always be
more restrictive. So there are arguments for doing it both ways.

I believe that we can and should offer both options to provide customer
choice. Ideally, it would be nice to be able to switch between the two
without significant reconfiguration, but that's definitely not for this
release.

I assume you are focusing on physical first, because that is easiest.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#15Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#11)
Re: Transaction-controlled robustness for replication

On Tue, 2008-08-12 at 11:51 -0400, Bruce Momjian wrote:

I think you need to make it an enumerated type like log_min_messages;
something like:

wal_transfer_wait

Yeh, that way sounds best and I like the name.

with values of:

nothing: have network traffic send WAL as needed
netflush: wait for flush of WAL network packets to slave
process: wait for slave to process WAL traffic and
optionally fsync

Suggest
async
syncnet
syncdisk

The 'process' option either waits for fsync on the slave or not
depending on how the slave is configured, meaning you could use
synchronous_commit off to not wait for the fsync to disk.

Hmmm, not sure that flicking a switch on the standby should cause a loss
of performance on the master. That will be an accident waiting to
happen. Best to make all things that affect the performance/robustness
of the master be configuration options on the master side.

I think we can add a table in the documentation to show how to set things
up in postgresql.conf for async master and async slave, and other
combinations.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#16Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#15)
Re: Transaction-controlled robustness for replication

Simon Riggs wrote:

On Tue, 2008-08-12 at 11:51 -0400, Bruce Momjian wrote:

I think you need to make it an enumerated type like log_min_messages;
something like:

wal_transfer_wait

Yeh, that way sounds best and I like the name.

with values of:

nothing: have network traffic send WAL as needed
netflush: wait for flush of WAL network packets to slave
process: wait for slave to process WAL traffic and
optionally fsync

Suggest
async
syncnet
syncdisk

I think the first two are fine, but 'syncdisk' might be wrong if the slave
has 'synchronous_commit = off'. Any ideas?

The 'process' option either waits for fsync on the slave or not
depending on how the slave is configured, meaning you could use
synchronous_commit off to not wait for the fsync to disk.

Hmmm, not sure that flicking a switch on the standby should cause a loss
of performance on the master. That will be an accident waiting to
happen. Best to make all things that affect the performance/robustness
of the master be configuration options on the master side.

Well, we are not removing synchronous_commit so a slave certainly could
be configured to be async, so I don't see that making the master set this
makes any sense, unless you want the master to somehow override that.

So, I think there are three options, async, syncnet, and perhaps
'commit', meaning the slave committed the transactions. That can be
sync slave or async slave, based on the configuration of the slave.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#16)
Re: Transaction-controlled robustness for replication

On Tue, 2008-08-12 at 12:54 -0400, Bruce Momjian wrote:

Simon Riggs wrote:

On Tue, 2008-08-12 at 11:51 -0400, Bruce Momjian wrote:

I think you need to make it an enumerated type like log_min_messages;
something like:

wal_transfer_wait

Yeh, that way sounds best and I like the name.

with values of:

nothing: have network traffic send WAL as needed
netflush: wait for flush of WAL network packets to slave
process: wait for slave to process WAL traffic and
optionally fsync

Suggest
async
syncnet
syncdisk

I think the first two are fine, but 'syncdisk' might be wrong if the slave
has 'synchronous_commit = off'. Any ideas?

Yes, synchronous_commit can be set in the postgresql.conf, but its great
advantage is that it is a userset parameter.

The main point of the post is that the parameter would be transaction
controlled, so *must* be set in the transaction and thus *must* be set
on the master. Otherwise the capability is not available in the way I am
describing.

synchronous_commit applies to transaction commits. The code path would
be completely different here, so having parameter passed as an info byte
from master will not cause code structure problems or performance
problems.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#18Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#17)
Re: Transaction-controlled robustness for replication

Simon Riggs wrote:
> > > > with values of:
> > > >
> > > >     nothing:  have network traffic send WAL as needed
> > > >     netflush: wait for flush of WAL network packets to slave
> > > >     process:  wait for slave to process WAL traffic and
> > > >               optionally fsync
> > >
> > > Suggest
> > >
> > >     async
> > >     syncnet
> > >     syncdisk
> >
> > I think the first two are fine, but 'syncdisk' might be wrong if the slave
> > has 'synchronous_commit = off'. Any ideas?
>
> Yes, synchronous_commit can be set in postgresql.conf, but its great
> advantage is that it is a userset parameter.
>
> The main point of the post is that the parameter would be transaction
> controlled, so it *must* be set in the transaction and thus *must* be set
> on the master. Otherwise the capability is not available in the way I am
> describing.

Oh, so synchronous_commit would not control WAL sync on the slave? What
about our fsync parameter? Because the slave is read-only, I saw no
disadvantage to setting synchronous_commit to off in postgresql.conf on
the slave.

> synchronous_commit applies to transaction commits. The code path would
> be completely different here, so having the parameter passed as an info
> byte from the master will not cause code structure problems or performance
> problems.

OK, I was just trying to simplify it. The big problem with an async
slave is that not only would you have lost data in a failover, but the
database might be inconsistent, as with fsync = off, which is something I
think we want to avoid; that is why I was suggesting
synchronous_commit = off.

Or were you thinking of always doing fsync on the slave, no matter what?
I am worried the slave might not be able to keep up (being
single-threaded), so we should allow a way to do async commit on the
slave. Certainly if the master sends the data asynchronously, there is
no need to do a synchronous commit on the slave.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#13)
Re: Transaction-controlled robustness for replication

Simon Riggs <simon@2ndquadrant.com> writes:
> On Tue, 2008-08-12 at 11:52 -0400, Bruce Momjian wrote:
> > What is the attraction of logical application of the WAL logs?
> > Transmitting to a server with different architecture?
>
> Yes,
>
> * different release
> * different encoding
> * different CPU architecture
> * (with the correct transform) a different DBMS

The notion that the WAL logs will ever be portable across such
differences is so ... so ... well, it's barely worth laughing at.

			regards, tom lane

#20Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#19)
Re: Transaction-controlled robustness for replication

On Tue, 2008-08-12 at 15:40 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Tue, 2008-08-12 at 11:52 -0400, Bruce Momjian wrote:
> > > What is the attraction of logical application of the WAL logs?
> > > Transmitting to a server with different architecture?
> >
> > Yes,
> >
> > * different release
> > * different encoding
> > * different CPU architecture
> > * (with the correct transform) a different DBMS
>
> The notion that the WAL logs will ever be portable across such
> differences is so ... so ... well, it's barely worth laughing at.

I expect to begin discussion of how that might be achieved in a few
days. I understand the starting place for your thinking, but am not
deterred, because I see some solutions. I feel certain you will point out
some difficulties, but I think it is worth doing. Please let's start
those discussions from a neutral point. If you find a real showstopper,
then so be it. Sorry for the delay.

Bruce was asking why I might want that. If we can at least agree
there is a use case then it helps.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#21Robert Hodges
robert.hodges@continuent.com
In reply to: Tom Lane (#19)
#22Markus Wanner
markus@bluegap.ch
In reply to: Robert Hodges (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Wanner (#22)
#24Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#23)
#25Robert Hodges
robert.hodges@continuent.com
In reply to: Tom Lane (#23)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#24)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#26)
#28Markus Wanner
markus@bluegap.ch
In reply to: Robert Hodges (#25)
#29Markus Wanner
markus@bluegap.ch
In reply to: Alvaro Herrera (#24)
#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#18)
#31Markus Wanner
markus@bluegap.ch
In reply to: Robert Hodges (#25)
#32Simon Riggs
simon@2ndQuadrant.com
In reply to: Markus Wanner (#31)
#33Markus Wanner
markus@bluegap.ch
In reply to: Simon Riggs (#32)
#34Simon Riggs
simon@2ndQuadrant.com
In reply to: Markus Wanner (#33)
#35Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#34)
#36Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#30)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#35)
#38Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#36)
#39Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#37)
#40Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#38)
#41Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#39)