Request for further clarification on synchronous_commit
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/12/runtime-config-wal.html
Description:
Hello,
First of all I would like to say that PostgreSQL has the best documentation
I've ever seen. It is very clear and comprehensive. That's the main reason
why I decided to add my little 2 cents and make it even better.
I think that the distinction between first three values of
synchronous_commit parameter is not clear enough
(https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS).
" When set to on, commits will wait until replies from the current
synchronous standby(s) indicate they have received the commit record of the
transaction and flushed it to disk. This ensures the transaction will not be
lost unless both the primary and all synchronous standbys suffer corruption
of their database storage. When set to remote_apply, commits will wait until
replies from the current synchronous standby(s) indicate they have received
the commit record of the transaction and applied it, so that it has become
visible to queries on the standby(s). When set to remote_write, commits will
wait until replies from the current synchronous standby(s) indicate they
have received the commit record of the transaction and written it out to
their operating system. This setting is sufficient to ensure data
preservation even if a standby instance of PostgreSQL were to crash, but not
if the standby suffers an operating-system-level crash, since the data has
not necessarily reached stable storage on the standby"
The last sentence : "This setting is sufficient to ensure data preservation
even if a standby instance..." seems to refer only to the remote_write
option while in my option it should refer to both remote_write and
remote_apply options, as the fsync is performed only when synchronous_commit
is set to ON.
In other words I think that the documentation should be more clear in terms
of which option uses fsync.
Best regards,
Kasper Kondzielski
On Fri, Aug 14, 2020 at 01:32:35PM +0000, PG Doc comments form wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/12/runtime-config-wal.html
Description:Hello,
First of all I would like to say that PostgreSQL has the best documentation
I've ever seen. It is very clear and comprehensive. That's the main reason
why I decided to add my little 2 cents and make it even better.I think that the distinction between first three values of
synchronous_commit parameter is not clear enough
(https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS)." When set to on, commits will wait until replies from the current
synchronous standby(s) indicate they have received the commit record of the
transaction and flushed it to disk. This ensures the transaction will not be
lost unless both the primary and all synchronous standbys suffer corruption
of their database storage. When set to remote_apply, commits will wait until
replies from the current synchronous standby(s) indicate they have received
the commit record of the transaction and applied it, so that it has become
visible to queries on the standby(s). When set to remote_write, commits will
wait until replies from the current synchronous standby(s) indicate they
have received the commit record of the transaction and written it out to
their operating system. This setting is sufficient to ensure data
preservation even if a standby instance of PostgreSQL were to crash, but not
if the standby suffers an operating-system-level crash, since the data has
not necessarily reached stable storage on the standby"The last sentence : "This setting is sufficient to ensure data preservation
even if a standby instance..." seems to refer only to the remote_write
option while in my option it should refer to both remote_write and
remote_apply options, as the fsync is performed only when synchronous_commit
is set to ON.In other words I think that the documentation should be more clear in terms
of which option uses fsync.
I think this paragraph just has just too complex. I have moved the
mention of remote_apply into its own paragraph, and simplified the
sentences about remote_write. Is this attached patch better?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
Attachments:
sync.difftext/x-diff; charset=us-asciiDownload+19-15
Hi, thanks for the reply.
To be honest I don't think it is better. Previously paragraph about
remote_apply was after paragraph about `on` and before remote_write which
followed natural order in terms of how strict these parameters are (i.e.
how strong are the guarantees they provide). Because of that I think that
remote_apply should return to its previous position.
My original concern was about the fact that the difference between `on`,
remote_write and remote_apply wasn't perfectly clear.
I am not sure if I understand this difference correctly but maybe such a
table could be helpful to me and others:
+-----------------------------+-------------------------------------------+
| | synchronous_commit |
+-----------------------------+-----+--------------+--------------+-------+
| operation on standby server | on | remote_apply | remote_write | local |
+-----------------------------+-----+--------------+--------------+-------+
| written to WAL | Yes | Yes | Yes | No |
+-----------------------------+-----+--------------+--------------+-------+
| commit transaction | Yes | Yes | No | No |
+-----------------------------+-----+--------------+--------------+-------+
| fsync | Yes | No | No | No |
+-----------------------------+-----+--------------+--------------+-------+
From which we can clearly see that only `on` option guarantees fsync, and
the only difference between remote_write and remote_apply is the visibility
of transaction results to the queries.
Also when it comes to the content of your reply, I have few questions:
+ Finally, when set to <literal>remote_apply</literal>, commits will
+ wait until replies from the current synchronous standby(s)
indicate
+ they have received the commit record of the transaction and
applied
+ it, so that it has become visible to queries on the standby(s).
+ This can cause much larger commit delays than previous settings
+ since it involves WAL replay.
'This can cause much' - What does it mean that it can cause? Under what
circumstances it will/won't cause it?
"since it involves WAL replay" - What is a WAL replay?
Best regards,
Kasper Kondzielski
pon., 17 sie 2020 o 19:47 Bruce Momjian <bruce@momjian.us> napisał(a):
Show quoted text
On Fri, Aug 14, 2020 at 01:32:35PM +0000, PG Doc comments form wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/12/runtime-config-wal.html
Description:Hello,
First of all I would like to say that PostgreSQL has the best
documentation
I've ever seen. It is very clear and comprehensive. That's the main
reason
why I decided to add my little 2 cents and make it even better.
I think that the distinction between first three values of
synchronous_commit parameter is not clear enough
(https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
)." When set to on, commits will wait until replies from the current
synchronous standby(s) indicate they have received the commit record ofthe
transaction and flushed it to disk. This ensures the transaction will
not be
lost unless both the primary and all synchronous standbys suffer
corruption
of their database storage. When set to remote_apply, commits will wait
until
replies from the current synchronous standby(s) indicate they have
received
the commit record of the transaction and applied it, so that it has
become
visible to queries on the standby(s). When set to remote_write, commits
will
wait until replies from the current synchronous standby(s) indicate they
have received the commit record of the transaction and written it out to
their operating system. This setting is sufficient to ensure data
preservation even if a standby instance of PostgreSQL were to crash, butnot
if the standby suffers an operating-system-level crash, since the data
has
not necessarily reached stable storage on the standby"
The last sentence : "This setting is sufficient to ensure data
preservation
even if a standby instance..." seems to refer only to the remote_write
option while in my option it should refer to both remote_write and
remote_apply options, as the fsync is performed only whensynchronous_commit
is set to ON.
In other words I think that the documentation should be more clear in
terms
of which option uses fsync.
I think this paragraph just has just too complex. I have moved the
mention of remote_apply into its own paragraph, and simplified the
sentences about remote_write. Is this attached patch better?--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.comThe usefulness of a cup is in its emptiness, Bruce Lee
On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
Hi, thanks for the reply.
To be�honest I don't think it is better. Previously paragraph about
remote_apply was after paragraph about `on` and before remote_write which
followed natural order in terms of how strict these parameters are (i.e. how
strong are the guarantees they provide). Because of that I think that
remote_apply should return to its previous position.
Uh, not really --- see below.
My original concern was about the fact that the difference between `on`,
remote_write and remote_apply wasn't perfectly clear.
I am not sure if I understand this difference correctly but maybe such a table
could be helpful to me and others:+-----------------------------+-------------------------------------------+ | | synchronous_commit | +-----------------------------+-----+--------------+--------------+-------+ | operation on standby server | on | remote_apply | remote_write | local | +-----------------------------+-----+--------------+--------------+-------+ | written to WAL | Yes | Yes | Yes | No | +-----------------------------+-----+--------------+--------------+-------+ | commit transaction | Yes | Yes | No | No | +-----------------------------+-----+--------------+--------------+-------+ | fsync | Yes | No | No | No | +-----------------------------+-----+--------------+--------------+-------+From which we can clearly see that only `on` option guarantees fsync, and the
only difference between remote_write and remote_apply is the visibility of
transaction results to the queries.
Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync. If you want to go in order of severity, with the most severe
first, it is:
remote_apply
on
remote_write
local
This is seen in the C enum ordering for synchronous_commit, but in
reverse order:
typedef enum
{
SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flush and remote
* write */
SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local and remote flush */
SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote apply */
} SyncCommitLevel;
and this defines the 'on' behavior:
/* Define the default setting for synchronous_commit */
#define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH
I will clarify this comment, and the docs, to say that remote_apply
includes remote flush.
Obviously these docs need improvement. Updated patch attached. I have
to admit I was kind of confused if remote_apply did remote fsync, but
never had the time to research it until you asked. remote_apply is so
different from the rest, and so heavy, that I put it last in its own
paragraph.
+ � � � � Finally, when set to <literal>remote_apply</literal>, commits will + � � � � wait until replies from the current synchronous standby(s) indicate + � � � � they have received the commit record of the transaction and applied + � � � � it, so that it has become visible to queries on the standby(s). + � � � � This can cause much larger commit delays than previous settings + � � � � since it involves WAL replay. 'This can cause much' - What does it mean that it can cause? Under what circumstances it will/won't cause it?
Uh, I think we can change this to "will cause", because I can't think of
a case where it will not.
"since it involves WAL replay" - What is a WAL replay?�
Well, there is a doc section that talks about WAL:
https://www.postgresql.org/docs/12/wal.html
and other parts of the config docs that talk about WAL.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
Attachments:
sync.difftext/x-diff; charset=us-asciiDownload+26-21
On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync. If you want to go in order of severity, with the most severe
first, it is:remote_apply
on
remote_write
localThis is seen in the C enum ordering for synchronous_commit, but in
reverse order:typedef enum
{
SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flush and remote
* write */
SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local and remote flush */
SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote apply */
} SyncCommitLevel;
Also, there is some logic to say that the postgresql.conf
synchronous_commit options list should be reordered from:
#synchronous_commit = on # synchronization level;
# off, local, remote_write, remote_apply, or on
to
#synchronous_commit = on # synchronization level;
# off, local, remote_write, on, or remote_apply
I think we should backpatch the doc changes, but maybe not the
postgresql.conf one --- I am not sure.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
Hi, thanks for the reply.
To be honest I don't think it is better. Previously paragraph about
remote_apply was after paragraph about `on` and before remote_write
which
followed natural order in terms of how strict these parameters are
(i.e. how
strong are the guarantees they provide). Because of that I think that
remote_apply should return to its previous position.
Uh, not really --- see below.
Ok, I see, thanks. Shouldn't we then stick to this order whenever possible
(might be sometimes reversed).
So, in the proposed patch I would suggest putting remote_apply first. (Of
course, before that we can mention that the default option is `on`, but
without going to much into the details.)
Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync. If you want to go in order of severity, with the most severe
first, it is:remote_apply
on
remote_write
local
Wouldn't the table be beneficial when it comes to highlighting these
differences?
+-----------------------------+---------------------------------------------------------+
| | synchronous_commit
|
+-----------------------------+--------------+-------------------+--------------+-------+
| operation on standby server | remote_apply | on (remote_flush) |
remote_write | local |
+-----------------------------+--------------+-------------------+--------------+-------+
| write to WAL | Yes | Yes | Yes
| No |
+-----------------------------+--------------+-------------------+--------------+-------+
| fsync | Yes | Yes | No
| No |
+-----------------------------+--------------+-------------------+--------------+-------+
| apply WAL data | Yes | No | No
| No |
+-----------------------------+--------------+-------------------+--------------+-------+
and this defines the 'on' behavior:
/* Define the default setting for synchronous_commit */
#define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH
Is there any valid reason to hide this behavior under `on` alias? In my
opinion `remote_flush` does much better job with describing what it does.
Maybe we could rename `on` to `remote_flush` but also create an alias
`on=remote_flush` to keep backward compatibility?
+ Finally, when set to <literal>remote_apply</literal>, commits
+ will wait until replies from the current synchronous standby(s)
+ indicate they have received the commit record of the transaction
+ and applied it, so that it has become visible to queries on the
+ standby(s), and also written to durable storage on the standbys.
"and also written to durable storage on the standbys." -> You mean flushed?
Maybe it should be better to stick to cohesive terminology to not introduce
any confusion.
Well, there is a doc section that talks about WAL:
https://www.postgresql.org/docs/12/wal.html
and other parts of the config docs that talk about WAL.
Yes, I know what is WAL for. I only don't get what kind of operation do you
mean by 'WAL replay'. The only one thing which I can think of is the
process of restoring database after a crash, when we apply changes from WAL
to the data pages which haven't been flushed to the disk, but I don't think
that this is that. Basically what I wonder is how can a WAL replay
influence the transaction commit?
wt., 18 sie 2020 o 19:17 Bruce Momjian <bruce@momjian.us> napisał(a):
Show quoted text
On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync. If you want to go in order of severity, with the most severe
first, it is:remote_apply
on
remote_write
localThis is seen in the C enum ordering for synchronous_commit, but in
reverse order:typedef enum
{
SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flushand remote
* write */
SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local andremote flush */
SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and
remote apply */
} SyncCommitLevel;
Also, there is some logic to say that the postgresql.conf
synchronous_commit options list should be reordered from:#synchronous_commit = on # synchronization level;
# off, local,
remote_write, remote_apply, or onto
#synchronous_commit = on # synchronization level;
# off, local,
remote_write, on, or remote_applyI think we should backpatch the doc changes, but maybe not the
postgresql.conf one --- I am not sure.--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.comThe usefulness of a cup is in its emptiness, Bruce Lee
On Wed, Aug 19, 2020 at 11:39:53AM +0200, Kasper Kondzielski wrote:
On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
Hi, thanks for the reply.
To be honest I don't think it is better. Previously paragraph about
remote_apply was after paragraph about `on` and before remote_write which
followed natural order in terms of how strict these parameters are (i.e.how
strong are the guarantees they provide). Because of that I think that
remote_apply should return to its previous position.Uh, not really --- see below.
Ok, I see, thanks. Shouldn't we then stick to this order whenever possible
(might be sometimes reversed).
So, in the proposed patch I would suggest putting remote_apply first. (Of
course, before that we can mention that the default option is `on`, but without
going to much into the details.)
Well, it is kind of confusing. I wanted to put remote_apply in its own
paragraph because it only applies to standbys, and because it is much
heavier and in a different scope (replay) than the others. Frankly,
remote_apply is realated to synchronicity only to the extent it allows
consistent/synchronous results from standbys, not related to syncing
data to the kernel or durable storage.
Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync.� If you want to go in order of severity, with the most severe
first, it is:� � � �remote_apply
� � � �on
� � � �remote_write
� � � �localWouldn't the table be beneficial when it comes to highlighting these
differences?
Uh, I don't think we list a table like this anywhere else for config
options. I would be interested if others think it would be helpful.
+-----------------------------+---------------------------------------------------------+ | | synchronous_commit | +-----------------------------+--------------+-------------------+--------------+-------+ | operation on standby server | remote_apply | on (remote_flush) | remote_write | local | +-----------------------------+--------------+-------------------+--------------+-------+ | write to WAL | Yes | Yes | Yes | No | +-----------------------------+--------------+-------------------+--------------+-------+ | fsync | Yes | Yes | No | No | +-----------------------------+--------------+-------------------+--------------+-------+ | apply WAL data | Yes | No | No | No | +-----------------------------+--------------+-------------------+--------------+-------+and this defines the 'on' behavior:
� � � �/* Define the default setting for synchronous_commit */
� � � �#define SYNCHRONOUS_COMMIT_ON � SYNCHRONOUS_COMMIT_REMOTE_FLUSHIs there any valid reason to hide this behavior under `on` alias? In my opinion
`remote_flush` does much better job with describing what it does. Maybe we
could rename `on` to `remote_flush` but also create an alias `on=remote_flush`
to keep backward compatibility?�
Well, I think we originally only had 'on', and later added the others.
Also, 'on' is also local flush. We don't support local _write_ where we
only write it to the kernel. We support fysync off, which I think is
the local behavior of remote_write. I think remote_write is saying we
want local fsync but no fsync for remote. Is that even correct?
This is certainly confusing. Maybe we do need a chart, but we need to
list local and standby behavior.
+ � � � � Finally, when set to <literal>remote_apply</literal>, commits + � � � � will wait until replies from the current synchronous standby(s) + � � � � indicate they have received the commit record of the transaction + � � � � and applied it, so that it has become visible to queries on the + � � � � standby(s), and also written to durable storage on the standbys."and also written to durable storage on the standbys." -> You mean flushed?
Maybe it should be better to stick to cohesive terminology to not introduce any
confusion.
Yes, I mean written to durable storage. I don't think you can use
"flushed" alone since you could be flusing the WAL buffers to the file
system.
Well, there is a doc section that talks about WAL:
� � � �https://www.postgresql.org/docs/12/wal.html
and other parts of the config docs that talk about WAL.
Yes, I know what is WAL for. I only don't get what kind of operation do you
mean by 'WAL replay'. The only one thing which I can think of is the process of
restoring database after a crash, when we apply changes from WAL to the data
pages which haven't been flushed to the disk, but I don't think that this is
that. Basically what I wonder is how can a WAL replay influence the transaction
commit?
Well, WAL reply is how replication works. Pretty much the same thing
that happens during crash recovery, but it happens continually.
Someone just wrote this blog entry, which I think helps explain what we
are talking about:
How is this for a table?
-- local -- ------------------- standbys ------------------
durable query durable commit durable commit
commit consistency after OS crash after PG crash
remote_apply X X X X
on X X X
remote_write X X
local X
off
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Fri, Aug 21, 2020 at 04:15:39PM -0400, Bruce Momjian wrote:
How is this for a table?
-- local -- ------------------- standbys ------------------
durable query durable commit durable commit
commit consistency after OS crash after PG crash
remote_apply X X X X
on X X X
remote_write X X
local X
off
I have created the attached doc patch which I plan to apply to all
supported versions of Postgres.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
Attachments:
sync.difftext/x-diff; charset=us-asciiDownload+115-31
On Tue, Oct 6, 2020 at 07:38:48PM -0400, Bruce Momjian wrote:
On Fri, Aug 21, 2020 at 04:15:39PM -0400, Bruce Momjian wrote:
How is this for a table?
-- local -- ------------------- standbys ------------------
durable query durable commit durable commit
commit consistency after OS crash after PG crash
remote_apply X X X X
on X X X
remote_write X X
local X
offI have created the attached doc patch which I plan to apply to all
supported versions of Postgres.
I have applied this patch through 9.6. 9.5 didn't have all the options.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
That's awesome, thanks!
czw., 15 paź 2020 o 21:16 Bruce Momjian <bruce@momjian.us> napisał(a):
Show quoted text
On Tue, Oct 6, 2020 at 07:38:48PM -0400, Bruce Momjian wrote:
On Fri, Aug 21, 2020 at 04:15:39PM -0400, Bruce Momjian wrote:
How is this for a table?
-- local -- ------------------- standbys
------------------
durable query durable commit durable
commit
commit consistency after OS crash after
PG crash
remote_apply X X X X
on X X X
remote_write X X
local X
offI have created the attached doc patch which I plan to apply to all
supported versions of Postgres.I have applied this patch through 9.6. 9.5 didn't have all the options.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.comThe usefulness of a cup is in its emptiness, Bruce Lee