Tracking replication slot "blockings"
I'm thinking it could be interesting to know how many times (or in some
other useful unit than "times" - how often) a specific replication slot has
"blocked" xlog rotation. Since this AFAIK only happens during checkpoints,
it seems it should be "reasonably cheap" to track? It would serve as an
indicator of which slave(s) are having enough trouble keeping up to
potentially cause issues.
Not having looked at that code at all yet, would this be something that's
simple to add?
Or is it a silly idea? :)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi,
On 2014-04-16 18:51:41 +0200, Magnus Hagander wrote:
> I'm thinking it could be interesting to know how many times (or in some
> other useful unit than "times" - how often) a specific replication slot has
> "blocked" xlog rotation. Since this AFAIK only happens during checkpoints,
> it seems it should be "reasonably cheap" to track? It would serve as an
> indicator of which slave(s) are having enough trouble keeping up to
> potentially cause issues.
The xlog removal code just checks the "global minimum" required LSN - it
doesn't check the individual slots. So you'd need to add a bit more code
to that location. But it'd be easy.
But I think I'd just monitor/graph the byte difference for all slots
using pg_replication_slots...
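A minimal sketch of the byte-difference idea above, assuming the monitoring job has already fetched each slot's restart_lsn from pg_replication_slots together with the server's current WAL position (how they are fetched is left out). The helper names parse_lsn and slot_lag_bytes are hypothetical. The only fact relied on is the textual LSN format: two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit byte position.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' into a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def slot_lag_bytes(current_lsn: str, restart_lsns: dict) -> dict:
    """Return, per slot name, how many bytes of WAL the slot still holds back.

    restart_lsns maps slot name -> restart_lsn text (hypothetical input shape).
    """
    current = parse_lsn(current_lsn)
    return {name: current - parse_lsn(lsn) for name, lsn in restart_lsns.items()}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    lag = slot_lag_bytes("1/10", {"standby_a": "0/FFFFFFF0", "standby_b": "1/0"})
    for name, nbytes in sorted(lag.items()):
        print(name, nbytes)
```

Graphing these numbers over time shows which slot is falling behind, but only from the moment the monitoring was installed.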
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 16, 2014 at 6:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
> On 2014-04-16 18:51:41 +0200, Magnus Hagander wrote:
> > I'm thinking it could be interesting to know how many times (or in some
> > other useful unit than "times" - how often) a specific replication slot has
> > "blocked" xlog rotation. Since this AFAIK only happens during checkpoints,
> > it seems it should be "reasonably cheap" to track? It would serve as an
> > indicator of which slave(s) are having enough trouble keeping up to
> > potentially cause issues.
> The xlog removal code just checks the "global minimum" required LSN - it
> doesn't check the individual slots. So you'd need to add a bit more code
> to that location. But it'd be easy.
Do we have statistics there somewhere - how often that global minimum
blocks something? That on its own might be a start :)
> But I think I'd just monitor/graph the byte difference for all slots
> using pg_replication_slots...
Yeah, that would work when monitored continuously. I was more looking for
the view of "hey, could this be what happened" into a system that did not
previously have any monitoring installed and therefore no such history.
--
Magnus Hagander
On 2014-04-16 19:09:09 +0200, Magnus Hagander wrote:
> On Wed, Apr 16, 2014 at 6:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > The xlog removal code just checks the "global minimum" required LSN - it
> > doesn't check the individual slots. So you'd need to add a bit more code
> > to that location. But it'd be easy.
> Do we have statistics there somewhere - how often that global minimum
> blocks something? That on its own might be a start :)
Nope. Check xlog.c:KeepLogSeg(), it's pretty simple stuff ;). It's the
same place where wal_keep_segments is enforced...
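To make the idea concrete, here is a rough Python model of the kind of check discussed here, extended with a per-slot "blocked" counter. It is a simplified sketch, not the code in xlog.c: the Slot class, the blocked_count field, the function name keep_log_seg, and the fixed 16 MB segment size are all assumptions made for illustration.

```python
from dataclasses import dataclass

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size


@dataclass
class Slot:
    name: str
    restart_lsn: int        # byte position; 0 means the slot reserves no WAL
    blocked_count: int = 0  # hypothetical statistic proposed in this thread


def keep_log_seg(recptr: int, wal_keep_segments: int, slots: list) -> int:
    """Return the oldest segment number that must be kept.

    recptr is the position up to which WAL could otherwise be removed.
    Each slot that forces an older segment to be retained gets its
    blocked_count incremented.
    """
    segno = recptr // WAL_SEGMENT_SIZE

    # Apply the wal_keep_segments limit first, never going below segment 1.
    if wal_keep_segments > 0:
        segno = max(1, segno - wal_keep_segments)

    cutoff = segno
    for slot in slots:
        if slot.restart_lsn == 0:
            continue
        slot_segno = max(1, slot.restart_lsn // WAL_SEGMENT_SIZE)
        if slot_segno < segno:
            # This slot holds back removal beyond what would happen anyway.
            slot.blocked_count += 1
            cutoff = min(cutoff, slot_segno)
    return cutoff
```

Called once per checkpoint, the counter would answer the "could this be what happened" question without any external monitoring history. A single global counter, incremented whenever the returned cutoff is lower than segno, would be the simpler first step.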
Greetings,
Andres Freund