Proposal: Log inability to lock pages during vacuum
Currently, a non-freeze vacuum will punt on any page it can't get a cleanup lock on, with no retry. Presumably this should be a rare occurrence, but I think it's bad that we just assume that and won't warn the user if something bad is going on.
My thought is that if we skip any pages elog(LOG) how many we skipped. If we skip more than 1% of the pages we visited (not relpages) then elog(WARNING) instead.
Comments?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Currently, a non-freeze vacuum will punt on any page it can't get a cleanup
lock on, with no retry. Presumably this should be a rare occurrence, but I
think it's bad that we just assume that and won't warn the user if something
bad is going on.My thought is that if we skip any pages elog(LOG) how many we skipped. If we
skip more than 1% of the pages we visited (not relpages) then elog(WARNING)
instead.
Is there some specific failure you've run into where a page was stuck
in a pinned state and never got vacuumed?
I would like to see a more systematic way of going about this. What
LSN or timestamp is associated with the oldest unvacuumed page? How
many times have we tried to visit it? What do those numbers look like
overall -- i.e. what's the median number of times it takes to vacuum a
page and what does the distribution look like of the unvacuumed ages?
With that data it should be possible to determine if the behaviour is
actually working well and where to draw the line to determine outliers
that might represent bugs.
--
greg
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/20/14, 10:29 AM, Greg Stark wrote:
On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Currently, a non-freeze vacuum will punt on any page it can't get a cleanup
lock on, with no retry. Presumably this should be a rare occurrence, but I
think it's bad that we just assume that and won't warn the user if something
bad is going on.My thought is that if we skip any pages elog(LOG) how many we skipped. If we
skip more than 1% of the pages we visited (not relpages) then elog(WARNING)
instead.Is there some specific failure you've run into where a page was stuck
in a pinned state and never got vacuumed?
Not that I know of... but how would I actually know? Having that info available is the point of my proposal. :)
I would like to see a more systematic way of going about this. What
LSN or timestamp is associated with the oldest unvacuumed page? How
many times have we tried to visit it? What do those numbers look like
overall -- i.e. what's the median number of times it takes to vacuum a
page and what does the distribution look like of the unvacuumed ages?With that data it should be possible to determine if the behaviour is
actually working well and where to draw the line to determine outliers
that might represent bugs.
I agree we could use better data about/for vacuum (see /messages/by-id/544468C1.6050101@BlueTreble.com).
In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to fin> that out. :)
It does happen, and not infrequently. Just not enough pages to normally
cause significant bloat. The most likely place where it happens is very
small tables that all connections hit with a high frequency. Starting to
issue high volume log spew for a nonexistant problem isn't helping.
If you're super convinced this is urgent then add it as a *single*
datapoint inside the existing messages. But I think there's loads of
stuff in vacuum logging that are more important this.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/20/14, 7:31 PM, Andres Freund wrote:
On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to fin> that out.:)
It does happen, and not infrequently. Just not enough pages to normally
cause significant bloat. The most likely place where it happens is very
small tables that all connections hit with a high frequency. Starting to
issue high volume log spew for a nonexistant problem isn't helping.
How'd you determine that? Is there some way to measure this? I'm not doubting you; I just don't want to work on a problem that's already solved.
If you're super convinced this is urgent then add it as a*single*
datapoint inside the existing messages. But I think there's loads of
stuff in vacuum logging that are more important this.
See my original proposal; at it's most intrusive this would issue one warning per (auto)vacuum if it was over a certain threshold. I would think that a DBA would really like to know when this happens, but if we think that's too much spew we can limit it to normal vacuum logging.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-10-20 19:43:38 -0500, Jim Nasby wrote:
On 10/20/14, 7:31 PM, Andres Freund wrote:
On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to fin> that out.:)
It does happen, and not infrequently. Just not enough pages to normally
cause significant bloat. The most likely place where it happens is very
small tables that all connections hit with a high frequency. Starting to
issue high volume log spew for a nonexistant problem isn't helping.How'd you determine that? Is there some way to measure this?
You'd seen individual pages with too old dead rows in them.
If you're super convinced this is urgent then add it as a*single*
datapoint inside the existing messages. But I think there's loads of
stuff in vacuum logging that are more important this.See my original proposal; at it's most intrusive this would issue one
warning per (auto)vacuum if it was over a certain threshold.
Which would vastly increase the log output for setups with small tables
and a nonzero log_autovacuum_min_duration.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jim Nasby wrote:
Currently, a non-freeze vacuum will punt on any page it can't get a
cleanup lock on, with no retry. Presumably this should be a rare
occurrence, but I think it's bad that we just assume that and won't
warn the user if something bad is going on.
I think if you really want to attack this problem, rather than just
being noisy about it, what you could do is to keep a record of which
page numbers you had to skip, and then once you're done with your first
scan you go back and retry the lock on the pages you skipped.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/21/14, 5:39 PM, Alvaro Herrera wrote:
Jim Nasby wrote:
Currently, a non-freeze vacuum will punt on any page it can't get a
cleanup lock on, with no retry. Presumably this should be a rare
occurrence, but I think it's bad that we just assume that and won't
warn the user if something bad is going on.I think if you really want to attack this problem, rather than just
being noisy about it, what you could do is to keep a record of which
page numbers you had to skip, and then once you're done with your first
scan you go back and retry the lock on the pages you skipped.
I'm OK with that if the community is; I was just trying for minimum invasiveness.
If I go this route, I'd like some input though...
- How to handle storing the blockIDs. Fixed size array or something fancier? What should we limit it to, especially since we're already allocating maintenance_work_mem for the tid array.
- What happens if we run out of space to remember skipped blocks? I could do something like what we do for running out of space in the dead_tuples array, but I'm worried that will add a serious amount of complexity, especially since re-processing these blocks could be what actually pushes us over the limit.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
- What happens if we run out of space to remember skipped blocks?
You forget some, and are no worse off than today. (This might be an
event worthy of logging, if the array is large enough that we don't
expect it to happen often ...)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/21/14, 6:05 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
- What happens if we run out of space to remember skipped blocks?
You forget some, and are no worse off than today. (This might be an
event worthy of logging, if the array is large enough that we don't
expect it to happen often ...)
Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/29/14, 11:49 AM, Jim Nasby wrote:
On 10/21/14, 6:05 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
- What happens if we run out of space to remember skipped blocks?
You forget some, and are no worse off than today. (This might be an
event worthy of logging, if the array is large enough that we don't
expect it to happen often ...)Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.
Attached is a patch for this. It also adds logging of unobtainable cleanup locks, and refactors scanning a page for vacuum into it's own function.
Anyone reviewing this might want to look at https://github.com/decibel/postgres/commit/69ab22f703d577cbb3d8036e4e42563977bcf74b, which is the refactor with no whitespace changes.
I've verified this works correctly by connecting to a backend with gdb and halting it with a page pinned. Both vacuum and vacuum freeze on that table do what's expected, but I also get this waring (which AFAICT is a false positive):
decibel@decina.local=# vacuum verbose i;
INFO: vacuuming "public.i"
INFO: "i": found 0 removable, 399774 nonremovable row versions in 1769 out of 1770 pages
DETAIL: 200000 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
Retried cleanup lock on 0 pages, retry failed on 1, skipped retry on 0.
CPU 0.00s/0.06u sec elapsed 12.89 sec.
WARNING: buffer refcount leak: [105] (rel=base/16384/16385, blockNum=0, flags=0x106, refcount=2 1)
VACUUM
I am doing a simple static allocation of retry_pages[]; my understanding is that will only exist for the duration of this function so it's OK. If not I'll palloc it. If it is OK then I'll do the same for the freeze array.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Attachments:
0001-Vacuum-cleanup-lock-retry.patchtext/plain; charset=UTF-8; name=0001-Vacuum-cleanup-lock-retry.patch; x-mac-creator=0; x-mac-type=0Download+541-424
On 2014-11-06 14:55:37 -0600, Jim Nasby wrote:
On 10/29/14, 11:49 AM, Jim Nasby wrote:
On 10/21/14, 6:05 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
- What happens if we run out of space to remember skipped blocks?
You forget some, and are no worse off than today. (This might be an
event worthy of logging, if the array is large enough that we don't
expect it to happen often ...)Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.
Attached is a patch for this. It also adds logging of unobtainable cleanup locks, and refactors scanning a page for vacuum into it's own function.
Anyone reviewing this might want to look at https://github.com/decibel/postgres/commit/69ab22f703d577cbb3d8036e4e42563977bcf74b, which is the refactor with no whitespace changes.
I've verified this works correctly by connecting to a backend with gdb and halting it with a page pinned. Both vacuum and vacuum freeze on that table do what's expected, but I also get this waring (which AFAICT is a false positive):
I think the retry logical is a largely pointless complication of already
complex enough code. You're fixing a problem for which there is
absolutely no evidence of its existance. Yes, this happens
occasionally. But it's going to be so absolutely minor in comparison to
just about every other source of bloat.
So, I pretty strongly against retrying. I could live with adding logging
of the number of pages skipped due to not being able to acquire the
cleanup lock. I don't think that's going to do help many people, but the
cost is pretty low.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I think the retry logical is a largely pointless complication of already
complex enough code. You're fixing a problem for which there is
absolutely no evidence of its existance. Yes, this happens
occasionally. But it's going to be so absolutely minor in comparison to
just about every other source of bloat.
I agree bloat isn't really a threat, but what about the relfrozenxid?
If we skip even one page we don't get to advance it and retrying could
eliminate those skipped pages and allow us to avoid a vacuum freeze
which can be really painful. Of course that only works if you can be
sure you haven't overflowed and forgotten any skipped pages and if you
don't find the page still pinned every time until you eventually give
up on it.
--
greg
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/6/14, 5:40 PM, Greg Stark wrote:
On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I think the retry logical is a largely pointless complication of already
complex enough code. You're fixing a problem for which there is
absolutely no evidence of its existance. Yes, this happens
occasionally. But it's going to be so absolutely minor in comparison to
just about every other source of bloat.
For some reason I don't have Andres' original email, so I'll reply here: I agree with you, and my original proposal was simply to log how many pages were skipped, but that was objected to. Simply logging this extra information would be a patch of a dozen lines or less.
The problem right now is there's no way to actually obtain evidence that this is (or isn't) something to worry about, because we just silently skip pages. If we had any kind of tracking on this we could stop guessing. :(
I agree bloat isn't really a threat, but what about the relfrozenxid?
If we skip even one page we don't get to advance it and retrying could
eliminate those skipped pages and allow us to avoid a vacuum freeze
which can be really painful. Of course that only works if you can be
sure you haven't overflowed and forgotten any skipped pages and if you
don't find the page still pinned every time until you eventually give
up on it.
The overflow part shouldn't be that big a deal. Either we just bump the array size up some more, or worst-case we scan it whenever it fills (like we do when we fill vacrelstats->dead_tuples.
But like I said above, I think this is already making a mountain out of a mole-hill, until we have evidence there's a real problem.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-11-06 23:40:18 +0000, Greg Stark wrote:
On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I think the retry logical is a largely pointless complication of already
complex enough code. You're fixing a problem for which there is
absolutely no evidence of its existance. Yes, this happens
occasionally. But it's going to be so absolutely minor in comparison to
just about every other source of bloat.I agree bloat isn't really a threat, but what about the relfrozenxid?
If we skip even one page we don't get to advance it and retrying could
eliminate those skipped pages and allow us to avoid a vacuum freeze
which can be really painful. Of course that only works if you can be
sure you haven't overflowed and forgotten any skipped pages and if you
don't find the page still pinned every time until you eventually give
up on it.
I don't buy this argument. Either you're constantly vacuuming the whole
relation anyway - in which case we'll acquire the cleanup lock
unconditionally - or we're doing partial vacuums via the visibility map
anyway.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-11-06 19:03:20 -0600, Jim Nasby wrote:
On 11/6/14, 5:40 PM, Greg Stark wrote:
On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I think the retry logical is a largely pointless complication of already
complex enough code. You're fixing a problem for which there is
absolutely no evidence of its existance. Yes, this happens
occasionally. But it's going to be so absolutely minor in comparison to
just about every other source of bloat.For some reason I don't have Andres' original email, so I'll reply
here: I agree with you, and my original proposal was simply to log how
many pages were skipped, but that was objected to. Simply logging this
extra information would be a patch of a dozen lines or less.
The objection was that it's unneccessary complexity. So you made the
patch a magnitude more complex *and* added logging? That doesn't make
much sense.
The problem right now is there's no way to actually obtain evidence
that this is (or isn't) something to worry about, because we just
silently skip pages. If we had any kind of tracking on this we could
stop guessing. :(
What's the worst consequence this could have? A couple pages not marked
all visible and not immediately cleaned up. That's not particularly
harmful.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Greg Stark wrote:
I agree bloat isn't really a threat, but what about the relfrozenxid?
If we skip even one page we don't get to advance it and retrying could
eliminate those skipped pages and allow us to avoid a vacuum freeze
which can be really painful. Of course that only works if you can be
sure you haven't overflowed and forgotten any skipped pages and if you
don't find the page still pinned every time until you eventually give
up on it.
We never advance relfrozenxid nowadays unless it's a whole-table scan;
and once we commit to doing those (vacuum_freeze_table_age is past), we
don't skip pages anymore.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
The problem right now is there's no way to actually obtain evidence that
this is (or isn't) something to worry about, because we just silently skip
pages. If we had any kind of tracking on this we could stop guessing. :(
I could see logging it, but I agree with Andres and Alvaro that the
odds are strongly against there being any actual problem here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/7/14, 8:21 PM, Robert Haas wrote:
On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
The problem right now is there's no way to actually obtain evidence that
this is (or isn't) something to worry about, because we just silently skip
pages. If we had any kind of tracking on this we could stop guessing. :(I could see logging it, but I agree with Andres and Alvaro that the
odds are strongly against there being any actual problem here.
I'm fine with that. Any other objections? Andres?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jim Nasby wrote:
On 11/7/14, 8:21 PM, Robert Haas wrote:
On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
The problem right now is there's no way to actually obtain evidence that
this is (or isn't) something to worry about, because we just silently skip
pages. If we had any kind of tracking on this we could stop guessing. :(I could see logging it, but I agree with Andres and Alvaro that the
odds are strongly against there being any actual problem here.I'm fine with that. Any other objections? Andres?
If what we want is to quantify the extent of the issue, would it be more
convenient to save counters to pgstat? Vacuum already sends pgstat
messages, so there's no additional traffic there.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers