Is there a way (except from server logs) to know the kind of on-going/last checkpoint?
Hi,
Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on. At times, the users/service layer
components would want to know the kind of checkpoint (along with other
checkpoint related info) to take some actions and it will be a bit
difficult to search through the server logs. The checkpoint info can
be obtained from the control file (either by pg_control_checkpoint()
or by pg_controldata tool) whereas checkpoint kind isn't available
there.
How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]for checkpoint: "checkpoint shutdown end-of-recovery immediate force wait wal time flush-all" for restartpoint: "restartpoint shutdown end-of-recovery immediate force wait wal time flush-all". Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.
Thoughts?
[1]: for checkpoint: "checkpoint shutdown end-of-recovery immediate force wait wal time flush-all" for restartpoint: "restartpoint shutdown end-of-recovery immediate force wait wal time flush-all"
force wait wal time flush-all"
for restartpoint: "restartpoint shutdown end-of-recovery immediate
force wait wal time flush-all"
Regards,
Bharath Rupireddy.
On 12/7/21 15:36, Bharath Rupireddy wrote:
Hi,
Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on. At times, the users/service layer
components would want to know the kind of checkpoint (along with other
checkpoint related info) to take some actions and it will be a bit
difficult to search through the server logs. The checkpoint info can
be obtained from the control file (either by pg_control_checkpoint()
or by pg_controldata tool) whereas checkpoint kind isn't available
there.How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]. Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.Thoughts?
I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).
I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?
[1] for checkpoint: "checkpoint shutdown end-of-recovery immediate
force wait wal time flush-all"
for restartpoint: "restartpoint shutdown end-of-recovery immediate
force wait wal time flush-all"
I'd bet squashing all of this into a single string (not really a flag)
will just mean people will have to parse it, etc. Keeping individual
boolean flags (or even separate string fields) would be better, I guess.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Dec 07, 2021 at 11:18:37PM +0100, Tomas Vondra wrote:
On 12/7/21 15:36, Bharath Rupireddy wrote:
Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on.
The checkpoint info can be obtained from the control file (either by
pg_control_checkpoint() or by pg_controldata tool) whereas checkpoint kind
isn't available there.
How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]. Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.
[1] for checkpoint: "checkpoint shutdown end-of-recovery immediate
force wait wal time flush-all"
for restartpoint: "restartpoint shutdown end-of-recovery immediate
force wait wal time flush-all"I'd bet squashing all of this into a single string (not really a flag)
will just mean people will have to parse it, etc. Keeping individual
boolean flags (or even separate string fields) would be better, I guess.
Since the size of controldata is a concern, I suggest to add an int16 to
populate with flags, which pg_controldata can parse for display.
Note that this other patch intends to add the timestamp and LSN of most recent
recovery to controldata.
https://commitfest.postgresql.org/36/3183/
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?
It could go into the pg_stat_checkpointer view, which would be the culmination
of another patch (cc Melanie).
https://commitfest.postgresql.org/36/3272/
--
Justin
On Wed, Dec 8, 2021 at 3:48 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 12/7/21 15:36, Bharath Rupireddy wrote:
Hi,
Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on. At times, the users/service layer
components would want to know the kind of checkpoint (along with other
checkpoint related info) to take some actions and it will be a bit
difficult to search through the server logs. The checkpoint info can
be obtained from the control file (either by pg_control_checkpoint()
or by pg_controldata tool) whereas checkpoint kind isn't available
there.How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]. Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.Thoughts?
I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).
Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
triggered backend can possibly go there (we can mention in the
documentation that the backend that triggered the checkpoint can get
terminated).
I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.
Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?
+1 to have pg_stat_progress_checkpoint view to know what's going on
with the current checkpoint.
Regards,
Bharath Rupireddy.
On Wed, Dec 8, 2021 at 4:07 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
I'd bet squashing all of this into a single string (not really a flag)
will just mean people will have to parse it, etc. Keeping individual
boolean flags (or even separate string fields) would be better, I guess.Since the size of controldata is a concern, I suggest to add an int16 to
populate with flags, which pg_controldata can parse for display.
+1. I will come up with a patch soon.
Note that this other patch intends to add the timestamp and LSN of most recent
recovery to controldata.
https://commitfest.postgresql.org/36/3183/
Thanks. I will go through it separately.
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?It could go into the pg_stat_checkpointer view, which would be the culmination
of another patch (cc Melanie).
https://commitfest.postgresql.org/36/3272/
+1 to have pg_stat_progress_checkpoint view. We have
CheckpointStatsData already. What we need is to make this structure
shared and add a little more info to represent the progress, so that
the other backends can access it. I think we can discuss this in a
separate thread to give it a fresh try rather than proposing this as a
part of another thread. I will spend some time on
pg_stat_progress_checkpoint proposal and try to come up with a
separate thread to discuss.
Regards,
Bharath Rupireddy.
On 12/8/21 02:54, Bharath Rupireddy wrote:
On Wed, Dec 8, 2021 at 3:48 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:On 12/7/21 15:36, Bharath Rupireddy wrote:
Hi,
Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on. At times, the users/service layer
components would want to know the kind of checkpoint (along with other
checkpoint related info) to take some actions and it will be a bit
difficult to search through the server logs. The checkpoint info can
be obtained from the control file (either by pg_control_checkpoint()
or by pg_controldata tool) whereas checkpoint kind isn't available
there.How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]. Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.Thoughts?
I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
triggered backend can possibly go there (we can mention in the
documentation that the backend that triggered the checkpoint can get
terminated).
My concern is someone might run something that requires a checkpoint, so
we start it and put the PID into the catalog. And then the person aborts
the command and starts doing something else. But that does not abort the
checkpoint, but the backend now runs something that doesn't requite
checkpoint, which is rather confusing.
I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.
OK, although I'm not sure it's all that useful (if we have that in some
sort of system view).
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?+1 to have pg_stat_progress_checkpoint view to know what's going on
with the current checkpoint.
Do you plan to add it to this patch, or should it be a separate patch?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
triggered backend can possibly go there (we can mention in the
documentation that the backend that triggered the checkpoint can get
terminated).My concern is someone might run something that requires a checkpoint, so
we start it and put the PID into the catalog. And then the person aborts
the command and starts doing something else. But that does not abort the
checkpoint, but the backend now runs something that doesn't requite
checkpoint, which is rather confusing.I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.OK, although I'm not sure it's all that useful (if we have that in some
sort of system view).
If the server is down, the control file will help. Since we already
have the other checkpoint info in the control file, it's much more
useful and sensible to have this extra piece of missing information
(checkpoint kind) there.
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?+1 to have pg_stat_progress_checkpoint view to know what's going on
with the current checkpoint.Do you plan to add it to this patch, or should it be a separate patch?
No, I will put some more thoughts around it and start a new thread.
Regards,
Bharath Rupireddy.
On 12/8/21 02:54, Bharath Rupireddy wrote:
On Wed, Dec 8, 2021 at 4:07 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
I'd bet squashing all of this into a single string (not really a flag)
will just mean people will have to parse it, etc. Keeping individual
boolean flags (or even separate string fields) would be better, I guess.Since the size of controldata is a concern, I suggest to add an int16 to
populate with flags, which pg_controldata can parse for display.+1. I will come up with a patch soon.
Note that this other patch intends to add the timestamp and LSN of most recent
recovery to controldata.
https://commitfest.postgresql.org/36/3183/Thanks. I will go through it separately.
We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?It could go into the pg_stat_checkpointer view, which would be the culmination
of another patch (cc Melanie).
https://commitfest.postgresql.org/36/3272/
I don't think the pg_stat_checkpointer would be a good match - that's
going to be an aggregated view of all past checkpoints, not a good
source info about the currently running one.
+1 to have pg_stat_progress_checkpoint view. We have
CheckpointStatsData already. What we need is to make this structure
shared and add a little more info to represent the progress, so that
the other backends can access it. I think we can discuss this in a
separate thread to give it a fresh try rather than proposing this as a
part of another thread. I will spend some time on
pg_stat_progress_checkpoint proposal and try to come up with a
separate thread to discuss.
+1 to discuss it as part of this patch
I'm not sure whether the view should look at CheckpointStatsData, or do
the same thing as the other pg_stat_progress_* views - send the data to
stat collector, and read it from there.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Dec 8, 2021 at 7:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.OK, although I'm not sure it's all that useful (if we have that in some
sort of system view).If the server is down, the control file will help. Since we already
have the other checkpoint info in the control file, it's much more
useful and sensible to have this extra piece of missing information
(checkpoint kind) there.
Here's the patch that adds the last checkpoint kind to the control
file, displayed as an output in the pg_controldata tool and also
exposes it via the pg_control_checkpoint function.
I will analyze and work on the idea of pg_stat_progress_checkpoint and
try to post the patch here in this thread.
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-add-last-checkpoint-kind-to-pg_control-file.patchapplication/octet-stream; name=v1-0001-add-last-checkpoint-kind-to-pg_control-file.patchDownload+54-7
On Thu, Dec 9, 2021 at 12:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Dec 8, 2021 at 7:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.OK, although I'm not sure it's all that useful (if we have that in some
sort of system view).If the server is down, the control file will help. Since we already
have the other checkpoint info in the control file, it's much more
useful and sensible to have this extra piece of missing information
(checkpoint kind) there.Here's the patch that adds the last checkpoint kind to the control
file, displayed as an output in the pg_controldata tool and also
exposes it via the pg_control_checkpoint function.
Here's v2, rebased onto the latest master.
Regards,
Bharath Rupireddy.
Attachments:
v2-0001-add-last-checkpoint-kind-to-pg_control-file.patchapplication/octet-stream; name=v2-0001-add-last-checkpoint-kind-to-pg_control-file.patchDownload+56-9
Hi all,
Here's v2, rebased onto the latest master.
I've reviewed this patch. The patch builds against the master (commit
e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
The patch does what it intends to do, namely store the kind of the
last checkpoint in the control file and display it in the output of
the pg_control_checkpoint() function and pg_controldata utility.
I did not test it with restartpoints though. Speaking of the torn
writes, the size of the control file with this patch applied does not
exceed 8Kb.
A few code comments:
+ char ckpt_kind[2 * MAXPGPATH];
I don't completely understand why 2 * MAXPGPATH is used here for the
buffer size.
[1]: https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/include/pg_config_manual.h#L106
does it relate to ckpt_kind ?
+ ControlFile->checkPointKind = 0;
It is worth a comment that 0 is unknown, as for instance in [2]https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/interfaces/libpq/fe-exec.c#L1078
+ (flags == 0) ? "unknown" : "",
That reads as if this patch would introduce a new "unknown" checkpoint state.
Why have it here at all if after for example initdb the kind is "shutdown" ?
+ snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+ (flags == 0) ? "unknown" : "",
+ (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
+ (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
+ (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "",
+ (flags & CHECKPOINT_FORCE) ? "force " : "",
+ (flags & CHECKPOINT_WAIT) ? "wait " : "",
+ (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "",
+ (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "",
+ (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");
The space at the strings' end (as in "wait " or "immediate ")
introduces extra whitespace in the output of pg_control_checkpoint().
A similar check at [3]https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/access/transam/xlog.c#L8851-L8859 places whitespace differently; that arrangement
of whitespace should remove the issue.
[1]: https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/include/pg_config_manual.h#L106
[2]: https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/interfaces/libpq/fe-exec.c#L1078
[3]: https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/access/transam/xlog.c#L8851-L8859
Regards,
Sergey Dudoladov
Hi,
On Thu, Jan 27, 2022 at 10:53:32AM +0100, Sergey Dudoladov wrote:
Hi all,
Here's v2, rebased onto the latest master.
I've reviewed this patch. The patch builds against the master (commit
e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
The patch does what it intends to do, namely store the kind of the
last checkpoint in the control file and display it in the output of
the pg_control_checkpoint() function and pg_controldata utility.
I don't agree with that.
What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind. It's still not possible to know if a checkpoint is in
progress or not and any kind of information related to it, so I'm not sure how
useful this will be compared to a checkpoint progress view.
Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.
On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind.
Ah after double checking I see it's storing the information *after* the
checkpoint completion, so it's indeed the last completed checkpoint. I'm not
sure how useful it can be, but ok.
Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.
This one is still valid I think, it's only storing the initial flags and not
the possibly upgraded one in shmem.
At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind.Ah after double checking I see it's storing the information *after* the
checkpoint completion, so it's indeed the last completed checkpoint. I'm not
sure how useful it can be, but ok.
I don't see it useful (but don't oppose) to record checkpoint kind in
control file. It is a kind of realtime noncritical info and in the
first place retrievable from server log if needed. And I'm skeptical
that it is needed such frequently. Checkpoint kind is useful to check
max_wal_size's appropriateness if it is in a summarized form as in
pg_stat_bgwriter. On the other hand showing the same in a stats view
or the output of pg_control_checkpoint() is fine by me.
Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.This one is still valid I think, it's only storing the initial flags and not
the possibly upgraded one in shmem.
Agreed.
I don't like to add this complex-but-need-in-sync blob twice. If we
need to do that twice, I want them consolidated in any shape.
Datum values[18];
bool nulls[18];
You forgot to expand these arrays.
This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
and pg_upgrade need to treat the change.
Even if we add checkpoint kind to control file, it would look a bit
strange that the "checkpoint kind" shows first among all
checkpoint-related lines. And at least the "wait" in the line is
really useless since it is not a property of a checkpoint. Instead, it
doesn't show "requested" which is one of the checkpoint properties
like "xlog" and "time". I'm not sure we need all of the properties to
be shown but I don't have a clear criteria for each property of it
ought to be shown or not.
pg_control last modified: Fri 28 Jan 2022 09:49:46 AM JST
Latest checkpoint kind: immediate force wait
Latest checkpoint location: 0/172B2C8
I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...
- proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
+ proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',
I think the additional column should be text[] instead of text, but
not sure.
And you need to edit the corresponding part of the doc.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Sorry, the last message lacks one citation.
At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind.Ah after double checking I see it's storing the information *after* the
checkpoint completion, so it's indeed the last completed checkpoint. I'm not
sure how useful it can be, but ok.
I don't see it useful (but don't oppose) to record checkpoint kind in
control file. It is a kind of realtime noncritical info and in the
first place retrievable from server log if needed. And I'm skeptical
that it is needed such frequently. Checkpoint kind is useful to check
max_wal_size's appropriateness if it is in a summarized form as in
pg_stat_bgwriter. On the other hand showing the same in a stats view
or the output of pg_control_checkpoint() is fine by me.
Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.This one is still valid I think, it's only storing the initial flags and not
the possibly upgraded one in shmem.
Agreed.
+ snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+ (flags == 0) ? "unknown" : "",
+ (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
+ (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
+ (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "",
+ (flags & CHECKPOINT_FORCE) ? "force " : "",
+ (flags & CHECKPOINT_WAIT) ? "wait " : "",
+ (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "",
+ (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "",
+ (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");
I don't like to add this complex-but-need-in-sync blob twice. If we
need to do that twice, I want them consolidated in any shape.
+ Datum values[18];
+ bool nulls[18];
You forgot to expand these arrays.
This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
and pg_upgrade need to treat the change.
Even if we add checkpoint kind to control file, it would look a bit
strange that the "checkpoint kind" shows first among all
checkpoint-related lines. And at least the "wait" in the line is
really useless since it is not a property of a checkpoint. Instead, it
doesn't show "requested" which is one of the checkpoint properties
like "xlog" and "time". I'm not sure we need all of the properties to
be shown but I don't have a clear criteria for each property of it
ought to be shown or not.
pg_control last modified: Fri 28 Jan 2022 09:49:46 AM JST
Latest checkpoint kind: immediate force wait
Latest checkpoint location: 0/172B2C8
I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...
- proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
+ proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',
I think the additional column should be text[] instead of text, but
not sure.
And you need to edit the corresponding part of the doc.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi,
On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...
Yes that's something I would like too. But even if the PIDs could be store, I
don't think that having the information for an already completed checkpoint
would be of any use at all.
For the current checkpoint, it should also be an array of PID. For instance if
the checkpointer started a throttled checkpoint, then someone calls a non
immediate pg_start_backup() and finally thinks it's too slow and need a fast
checkpoint. This would be welcome in a new pg_stat_progress_checkpoint view.
At Fri, 28 Jan 2022 10:41:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Sorry, the last message lacks one citation.
At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind.Ah after double checking I see it's storing the information *after* the
checkpoint completion, so it's indeed the last completed checkpoint. I'm not
sure how useful it can be, but ok.I don't see it useful (but don't oppose) to record checkpoint kind in
control file. It is a kind of realtime noncritical info and in the
first place retrievable from server log if needed. And I'm skeptical
that it is needed such frequently. Checkpoint kind is useful to check
max_wal_size's appropriateness if it is in a summarized form as in
pg_stat_bgwriter. On the other hand showing the same in a stats view
or the output of pg_control_checkpoint() is fine by me.Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.This one is still valid I think, it's only storing the initial flags and not
the possibly upgraded one in shmem.Agreed.
+ snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s", + (flags == 0) ? "unknown" : "", + (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "", + (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "", + (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "", + (flags & CHECKPOINT_FORCE) ? "force " : "", + (flags & CHECKPOINT_WAIT) ? "wait " : "", + (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "", + (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "", + (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");I don't like to add this complex-but-need-in-sync blob twice. If we
need to do that twice, I want them consolidated in any shape.+ Datum values[18];
+ bool nulls[18];You forgot to expand these arrays.
This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
and pg_upgrade need to treat the change.Even if we add checkpoint kind to control file, it would look a bit
strange that the "checkpoint kind" shows first among all
checkpoint-related lines. And at least the "wait" in the line is
really useless since it is not a property of a checkpoint. Instead, it
doesn't show "requested" which is one of the checkpoint properties
like "xlog" and "time". I'm not sure we need all of the properties to
be shown but I don't have a clear criteria for each property of it
ought to be shown or not.pg_control last modified: Fri 28 Jan 2022 09:49:46 AM JST
Latest checkpoint kind: immediate force wait
Latest checkpoint location: 0/172B2C8I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...- proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}', + proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',I think the additional column should be text[] instead of text, but
not sure.And you need to edit the corresponding part of the doc.
I have an additional comment.
+ char ckpt_kind[2 * MAXPGPATH];
..
+ snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+ (flags == 0) ? "unknown" : "",
+ (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
The value "2 * MAXPGPATH" is utterly nonsense about bouth "2" and
"MAXPGPATH", and the product of them is apparently too large.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 28 Jan 2022 10:00:46 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
Hi,
On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...Yes that's something I would like too. But even if the PIDs could be store, I
don't think that having the information for an already completed checkpoint
would be of any use at all.For the current checkpoint, it should also be an array of PID. For instance if
the checkpointer started a throttled checkpoint, then someone calls a non
immediate pg_start_backup() and finally thinks it's too slow and need a fast
checkpoint. This would be welcome in a new pg_stat_progress_checkpoint view.
Yeah, I thought of the same issue. But at the time of the previous
mail I thought it would be enough that the PID is of the first
initiator of the current running checkpoint likewise the checkpoint
kind. So it is important to think about the use case. If we put
significancy on actually happend checkpoints, we don't need absorbed
checkpoint requests to be recorded but if we put significance on
checkpoint requests, we need to care of every checkpoint request.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Jan 28, 2022 at 7:30 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
Hi,
On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...Yes that's something I would like too. But even if the PIDs could be store, I
don't think that having the information for an already completed checkpoint
would be of any use at all.For the current checkpoint, it should also be an array of PID. For instance if
the checkpointer started a throttled checkpoint, then someone calls a non
immediate pg_start_backup() and finally thinks it's too slow and need a fast
checkpoint. This would be welcome in a new pg_stat_progress_checkpoint view.
Thanks all for the comments. pg_stat_progress_checkpoint is being
discussed in another thread [1]/messages/by-id/CALj2ACV-F+K+z+XW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ@mail.gmail.com.
I will respond to the other comments soon.
[1]: /messages/by-id/CALj2ACV-F+K+z+XW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ@mail.gmail.com
Regards,
Bharath Rupireddy.
Hi,
On Fri, Jan 28, 2022 at 08:17:39AM +0530, Bharath Rupireddy wrote:
Thanks all for the comments. pg_stat_progress_checkpoint is being
discussed in another thread [1].[1] /messages/by-id/CALj2ACV-F+K+z+XW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ@mail.gmail.com
This thread was explicitly talking about log reporting and it's not clear to me
that it's now supposed to be about a new pg_stat_progress_checkpoint()
function. I think you should start a new thread or at least change the subject
to be clearer.