[TODO] Track number of files ready to be archived in pg_stat_archiver

Started by Julien Rouhaud, over 11 years ago, 15 messages, hackers
#1Julien Rouhaud
rjuju123@gmail.com

Hello,

Attached patch implements the following TODO item :

Track number of WAL files ready to be archived in pg_stat_archiver

However, it will track the total number of any file ready to be
archived, not only WAL files.

Please let me know what you think about it.

Regards.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

pg_stat_archiver_ready_count-v1.patch (text/x-patch, +128 -74)
#2Gilles Darold
gilles@darold.net
In reply to: Julien Rouhaud (#1)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 21/08/2014 10:17, Julien Rouhaud wrote:

Hello,

Attached patch implements the following TODO item :

Track number of WAL files ready to be archived in pg_stat_archiver

However, it will track the total number of any file ready to be
archived, not only WAL files.

Please let me know what you think about it.

Regards.

Hi,

Maybe looking at the archive ready count would be more efficient if it
were done in the view definition through a function. This would avoid
any issue with incrementing/decrementing archiverStats.ready_count, and
the patch would be simpler. Also, I don't think we need another memory
allocation for that: the counter information is always available as the
number of .ready files in the archive_status directory, and a call to
pg_stat_archiver doesn't need to be fast.

For example, have a new function called
pg_stat_get_archive_ready_count() that does the same as what you added
to pgstat_read_statsfiles(), with pg_stat_archiver defined as follows:

CREATE VIEW pg_stat_archiver AS
SELECT
s.archived_count,
s.last_archived_wal,
s.last_archived_time,
s.failed_count,
s.last_failed_wal,
s.last_failed_time,
pg_stat_get_archive_ready_count() AS ready_count,
s.stats_reset
FROM pg_stat_get_archiver() s;

The function pg_stat_get_archive_ready_count() will also be available
for any other querying.
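
As a rough illustration of what such a counting function would have to do, here is a minimal sketch in plain POSIX C. It is not the patch and does not use the backend's directory helpers; the names has_ready_suffix() and count_ready_files() are hypothetical, and the real archiver loop applies further checks on the file name that are left out here.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

#define READY_SUFFIX ".ready"

/* Return 1 if name ends with ".ready" and has a non-empty base name. */
int
has_ready_suffix(const char *name)
{
    size_t      len = strlen(name);
    size_t      suffixlen = strlen(READY_SUFFIX);

    return len > suffixlen &&
        strcmp(name + len - suffixlen, READY_SUFFIX) == 0;
}

/*
 * Count the ".ready" entries in dirpath (hypothetical helper).
 * Returns -1 if the directory cannot be opened.
 */
long
count_ready_files(const char *dirpath)
{
    DIR        *dir = opendir(dirpath);
    struct dirent *de;
    long        count = 0;

    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        if (has_ready_suffix(de->d_name))
            count++;
    }

    closedir(dir);
    return count;
}
```

Calling count_ready_files() on the archive_status directory gives the queue length at the time of the call, at the cost of one directory scan per call.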

--
Gilles Darold
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Gilles Darold (#2)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 25/08/2014 19:00, Gilles Darold wrote:

On 21/08/2014 10:17, Julien Rouhaud wrote:

Hello,

Attached patch implements the following TODO item :

Track number of WAL files ready to be archived in pg_stat_archiver

However, it will track the total number of any file ready to be
archived, not only WAL files.

Please let me know what you think about it.

Regards.

Hi,

Maybe looking at the archive ready count would be more efficient if it
were done in the view definition through a function. This would avoid
any issue with incrementing/decrementing archiverStats.ready_count, and
the patch would be simpler. Also, I don't think we need another memory
allocation for that: the counter information is always available as the
number of .ready files in the archive_status directory, and a call to
pg_stat_archiver doesn't need to be fast.

For example, have a new function called
pg_stat_get_archive_ready_count() that does the same as what you added
to pgstat_read_statsfiles(), with pg_stat_archiver defined as follows:

CREATE VIEW pg_stat_archiver AS
SELECT
s.archived_count,
s.last_archived_wal,
s.last_archived_time,
s.failed_count,
s.last_failed_wal,
s.last_failed_time,
pg_stat_get_archive_ready_count() AS ready_count,
s.stats_reset
FROM pg_stat_get_archiver() s;

The function pg_stat_get_archive_ready_count() will also be available
for any other querying.

Indeed, this approach should be more efficient. It also avoids unexpected
results, for example if someone has the bad idea of removing a .ready file
from the pg_xlog/archive_status directory.

Attached v2 patch implements this approach. All the work is still done
in pg_stat_get_archiver, as I don't think that having a specific
function for that information would be really interesting.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

pg_stat_archiver_ready_count-v2.patch (text/x-patch, +54 -18)
#4Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#3)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Thu, Aug 28, 2014 at 7:37 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Attached v2 patch implements this approach. All the work is still done
in pg_stat_get_archiver, as I don't think that having a specific
function for that information would be really interesting.

Please be sure to add that to the next commit fest. This is a feature
most welcome within this system view.
Regards,
--
Michael


#5Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#4)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 28/08/2014 05:58, Michael Paquier wrote:

On Thu, Aug 28, 2014 at 7:37 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Attached v2 patch implements this approach. All the work is still
done in pg_stat_get_archiver, as I don't think that having a
specific function for that information would be really
interesting.

Please be sure to add that to the next commit fest. This is a
feature most welcome within this system view. Regards,

I just added it.

Thanks.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

#6Adam Brightwell
adam.brightwell@crunchydatasolutions.com
In reply to: Julien Rouhaud (#5)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Julien,

The following is an initial review:

* Applies cleanly to master (f330a6d).
* Regression tests updated and pass, including 'check-world'.
* Documentation updated and builds successfully.
* Might want to consider replacing the following magic number with a
constant or perhaps calculated value.

+ int basenamelen = (int) strlen(rlde->d_name) - 6;

* Wouldn't it be easier, or perhaps more reliable to use "strrchr()" with
the following instead?

+ strcmp(rlde->d_name + basenamelen, ".ready") == 0)

char *extension = strrchr(rlde->d_name, '.');
...
strcmp(extension, ".ready") == 0)

I think this approach might also help to resolve the magic number above.
For example:

char *extension = strrchr(rlde->d_name, '.');
int basenamelen = (int) (strlen(rlde->d_name) - strlen(extension));
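
One caveat with the strrchr() variant is that it returns NULL for a name without any dot, so the result has to be checked before it is passed to strcmp(). A minimal standalone sketch, assuming a hypothetical helper name ready_basename_len() that is not part of the patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the base name if "name" ends with ".ready",
 * or -1 otherwise. Hypothetical helper, for illustration only.
 */
int
ready_basename_len(const char *name)
{
    /* Points at the last '.' in name, or is NULL if there is none. */
    const char *extension = strrchr(name, '.');

    if (extension == NULL || strcmp(extension, ".ready") != 0)
        return -1;

    /* Everything before the last '.' is the base name. */
    return (int) (extension - name);
}
```

Because only the last dot is considered, a name such as 00000002.history.ready yields the base name 00000002.history.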

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#7Julien Rouhaud
rjuju123@gmail.com
In reply to: Adam Brightwell (#6)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Tue, Oct 21, 2014 at 7:35 AM, Brightwell, Adam <
adam.brightwell@crunchydatasolutions.com> wrote:

Julien,

The following is an initial review:

Thanks for the review.

* Applies cleanly to master (f330a6d).
* Regression tests updated and pass, including 'check-world'.
* Documentation updated and builds successfully.
* Might want to consider replacing the following magic number with a
constant or perhaps calculated value.

+ int basenamelen = (int) strlen(rlde->d_name) - 6;

* Wouldn't it be easier, or perhaps more reliable to use "strrchr()" with
the following instead?

+ strcmp(rlde->d_name + basenamelen, ".ready") == 0)

char *extension = strrchr(rlde->d_name, '.');
...
strcmp(extension, ".ready") == 0)

I think this approach might also help to resolve the magic number above.
For example:

char *extension = strrchr(rlde->d_name, '.');
int basenamelen = (int) (strlen(rlde->d_name) - strlen(extension));

Actually, I used the same loop as the archiver one (see
backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
same number of files.

If we change it in this patch, it would be better to change it everywhere.
What do you think ?

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#8Adam Brightwell
adam.brightwell@crunchydatasolutions.com
In reply to: Julien Rouhaud (#7)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Julien,

Actually, I used the same loop as the archiver one (see
backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
same number of files.

Ah, I see.

If we change it in this patch, it would be better to change it everywhere.
What do you think ?

Hmm... I'd have to defer to the better judgement of a committer on that
one. Though, I would think that the general desire would be to keep the
patch relevant ONLY to the necessary changes. I would not qualify making
those types of changes as relevant, IMHO. I do think this is potential for
cleanup, however, I would suspect that would be best done in a separate
patch. But again, I'd defer to a committer whether such changes are even
necessary/acceptable.

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Julien Rouhaud (#1)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 21 August 2014 09:17, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

Track number of WAL files ready to be archived in pg_stat_archiver

Would it be OK to ask what the purpose of this TODO item is?

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.
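
To make the calculation concrete, here is a minimal sketch of how the distance between two WAL file names could be computed. It assumes the default 16MB segment size (256 segments per 4GB of WAL), that both names are on the same timeline, and that the caller knows the name of the segment currently being written; parse_wal_name() and wal_queue_estimate() are hypothetical names, not existing functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: 16MB segments, i.e. 0x100000000 / 16MB = 256 per "log" id. */
#define SEGMENTS_PER_XLOGID 256

/*
 * Parse a 24-hex-digit WAL file name (timeline, log id, segment).
 * Returns 0 on success, -1 if the name does not parse.
 */
int
parse_wal_name(const char *name, uint32_t *tli, uint64_t *segno)
{
    unsigned int t, log, seg;

    if (sscanf(name, "%08X%08X%08X", &t, &log, &seg) != 3)
        return -1;

    *tli = t;
    *segno = (uint64_t) log * SEGMENTS_PER_XLOGID + seg;
    return 0;
}

/*
 * Number of segments after last_archived, up to and including current.
 * Returns -1 if a name is invalid, the timelines differ, or current is
 * older than last_archived.
 */
int64_t
wal_queue_estimate(const char *last_archived, const char *current)
{
    uint32_t    tli_a, tli_b;
    uint64_t    seg_a, seg_b;

    if (parse_wal_name(last_archived, &tli_a, &seg_a) != 0 ||
        parse_wal_name(current, &tli_b, &seg_b) != 0)
        return -1;
    if (tli_a != tli_b || seg_b < seg_a)
        return -1;

    return (int64_t) (seg_b - seg_a);
}
```

The result is only an estimate: it says nothing about files placed in the archive queue by other means.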

Please don't take "it is a TODO item" as "generally accepted that
this makes sense".

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#10Michael Paquier
michael@paquier.xyz
In reply to: Simon Riggs (#9)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Tue, Nov 18, 2014 at 5:47 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 21 August 2014 09:17, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

Track number of WAL files ready to be archived in pg_stat_archiver

Would it be OK to ask what the purpose of this TODO item is?

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.

Not sure if this holds true in a node freshly started from a base
backup with a set of WAL files, or with files manually copied by an
operator.

Please don't take "it is a TODO item" as "generally accepted that
this makes sense".

On systems where the WAL archiving is slower than WAL generation at
peak time, the DBA may want to know how long is the queue of WAL files
waiting to be archived. That's IMO something we simply forgot in the
first implementation of pg_stat_archiver, and the most direct way to
know that is to count the .ready files in archive_status.
--
Michael


#11Michael Paquier
michael@paquier.xyz
In reply to: Adam Brightwell (#8)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Wed, Oct 22, 2014 at 12:50 AM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:

Though, I would think that the general desire would be to keep the patch
relevant ONLY to the necessary changes. I would not qualify making those
types of changes as relevant, IMHO. I do think this is potential for
cleanup, however, I would suspect that would be best done in a separate
patch. But again, I'd defer to a committer whether such changes are even
necessary/acceptable.

I have been looking at this patch, and I think that it is a mistake to
count the .ready files present in archive_status when calling
pg_stat_get_archiver(). If there are many files waiting to be
archived, this penalizes the run time of this function and of the
application relying on those results, not to mention that the loop
used to count the .ready files is a copy of what is in pgarch.c.
Hence I think that we should simply count them in pgarch_readyXlog
and return the value to pgarch_ArchiverCopyLoop. That value could
then be decremented by 1 each time a file is successfully archived,
which keeps the stats as precise as possible and keeps the reported
number useful while the archiver process is within a single
iteration of pgarch_ArchiverCopyLoop. This way, we just need to extend
PgStat_MsgArchiver with a new counter to track this number.
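
The counting scheme described above can be sketched outside the backend as follows. This is only an illustration under stated assumptions, not the v3 patch: QueueStats, copy_loop() and the two stub callbacks are hypothetical names, and the callback stands in for running archive_command on one file.

```c
#include <stddef.h>

/* Stand-in for archiving one file; returns 1 on success, 0 on failure. */
typedef int (*archive_fn) (const char *name);

/* Hypothetical stats holder with the proposed counter. */
typedef struct QueueStats
{
    long        ready_count;    /* files still waiting to be archived */
} QueueStats;

/* Stub callbacks so that the sketch can be exercised. */
int
stub_archive_ok(const char *name)
{
    (void) name;
    return 1;
}

int
stub_archive_fail(const char *name)
{
    (void) name;
    return 0;
}

/*
 * One pass of the copy loop: set the counter from the scan result, then
 * decrement it once per successfully archived file. Stops at the first
 * failure and returns the number of files archived.
 */
long
copy_loop(QueueStats *stats, const char *const *ready, size_t nready,
          archive_fn archive)
{
    long        archived = 0;

    stats->ready_count = (long) nready;

    for (size_t i = 0; i < nready; i++)
    {
        if (!archive(ready[i]))
            break;
        stats->ready_count--;
        archived++;
    }

    return archived;
}
```

With this shape the directory is scanned only by the archiver itself, and readers of the statistics see a counter rather than triggering a scan.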

The attached patch, based on v2 sent previously, does so. Thoughts?
--
Michael

Attachments:

pg_stat_archiver_ready_count-v3.patch (text/x-patch; charset=US-ASCII, +52 -25)
#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Michael Paquier (#10)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 18 November 2014 06:20, Michael Paquier <michael.paquier@gmail.com> wrote:

the DBA may want to know how long is the queue of WAL files
waiting to be archived.

Agreed

That's IMO something we simply forgot in the
first implementation of pg_stat_archiver

That's not how it appears to me. ISTM that the information requested
is already available, it just needs some minor calculations to work
out how many files are required.

the most direct way to
know that is to count the .ready files in archive_status.

...my earlier point was...

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.

As soon as we have sent the first file, we will know the queue length
at any point afterwards.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#13Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#11)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Hearing nothing from the original author, this patch that was in state
"Waiting on Author" for a couple of days is switched to "returned with
feedback".
Regards,
--
Michael

#14Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#11)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 18/11/2014 08:36, Michael Paquier wrote:

On Wed, Oct 22, 2014 at 12:50 AM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:

Though, I would think that the general desire would be to keep
the patch relevant ONLY to the necessary changes. I would not
qualify making those types of changes as relevant, IMHO. I do
think this is potential for cleanup, however, I would suspect
that would be best done in a separate patch. But again, I'd
defer to a committer whether such changes are even
necessary/acceptable.

I have been looking at this patch, and I think that it is a mistake
to count the .ready files present in archive_status when calling
pg_stat_get_archiver(). If there are many files waiting to be
archived, this penalizes the run time of this function and of the
application relying on those results, not to mention that the loop
used to count the .ready files is a copy of what is in pgarch.c.
Hence I think that we should simply count them in pgarch_readyXlog
and return the value to pgarch_ArchiverCopyLoop. That value could
then be decremented by 1 each time a file is successfully archived,
which keeps the stats as precise as possible and keeps the reported
number useful while the archiver process is within a single
iteration of pgarch_ArchiverCopyLoop. This way, we just need to
extend PgStat_MsgArchiver with a new counter to track this number.

The attached patch, based on v2 sent previously, does so.
Thoughts?

Sorry for this late answer.

I agree with you about the problems of the v2 patch I originally sent.
I think this v3 is the right way of keeping track of .ready files, so
it's ok for me. The v3 also still applies well on current head.

Regards.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

#15Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#14)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Sat, Dec 13, 2014 at 11:53 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

I agree with you about the problems of the v2 patch I originally sent.
I think this v3 is the right way of keeping track of .ready files, so
it's ok for me. The v3 also still applies well on current head.

Simon made a good point in mentioning that we can currently estimate the
number of files to be archived with the information that we have now,
given how the logic in the archiver works. This information would
still be useful for a freshly promoted node that needs to archive a
bunch of files, by the way. But there are now also discussions about
having a node archive only the WAL files it produces, that is, a master
archiving only WAL files on its current timeline, so we wouldn't
really need this patch if that's done.
--
Michael
