Should we remove vacuum_defer_cleanup_age?

Started by Andres Freundabout 3 years ago30 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is not
heavily used - the bug was trivial to hit as soon as vacuum_defer_cleanup_age
is set to a non-toy value. It complicates thinking about visibility horizons
substantially, as vacuum_defer_cleanup_age can make them go backward
substantially. Obviously it's also severely undertested.

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
not yet have hot_standby_feedback. Now that that exists,
vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
pessimisistic, i.e. always retains rows, even if none of the standbys has an
old enough snapshot.

The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is that
it provides a limit of some sort. But transactionids aren't producing dead
rows in a uniform manner, so limiting via xid isn't particularly useful. And
even if there are use cases, it seems those would be served better by
introducing a cap on how much hot_standby_feedback can hold the horizon back.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Greetings,

Andres Freund

#2Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#1)
Re: Should we remove vacuum_defer_cleanup_age?

On Sat, Mar 18, 2023 at 12:09 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is
not
heavily used - the bug was trivial to hit as soon as
vacuum_defer_cleanup_age
is set to a non-toy value. It complicates thinking about visibility
horizons
substantially, as vacuum_defer_cleanup_age can make them go backward
substantially. Obviously it's also severely undertested.

I started writing a test for vacuum_defer_cleanup_age while working on the
fix
referenced above, but now I am wondering if said energy would be better
spent
removing vacuum_defer_cleanup_age alltogether.

vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
not yet have hot_standby_feedback. Now that that exists,
vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
pessimisistic, i.e. always retains rows, even if none of the standbys has
an
old enough snapshot.

The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is
that
it provides a limit of some sort. But transactionids aren't producing dead
rows in a uniform manner, so limiting via xid isn't particularly useful.
And
even if there are use cases, it seems those would be served better by
introducing a cap on how much hot_standby_feedback can hold the horizon
back.

I don't think I have the cycles to push this through in the next weeks,
but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

+1. I haven't seen any (correct) use of this in many many years on my end
at least.

And yes, having a cap on hot_standby_feedback would also be great.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: Should we remove vacuum_defer_cleanup_age?

On 2023-Mar-17, Andres Freund wrote:

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

+1 I agree it's not useful anymore.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Hmm, for the time being, can we just "disable" it by disallowing to set
the GUC to a value different from 0? Then we can remove the code later
in the cycle at leisure.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"

#4Justin Pryzby
pryzby@telsasoft.com
In reply to: Alvaro Herrera (#3)
Re: Should we remove vacuum_defer_cleanup_age?

On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:

On 2023-Mar-17, Andres Freund wrote:

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

+1 I agree it's not useful anymore.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Hmm, for the time being, can we just "disable" it by disallowing to set
the GUC to a value different from 0? Then we can remove the code later
in the cycle at leisure.

It can be useful to do a "rolling transition", and it's something I do
often.

But I can't see why that would be useful here? It seems like something
that could be done after the feature freeze. It's removing a feature,
not adding one.

--
Justin

#5Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#4)
Re: Should we remove vacuum_defer_cleanup_age?

Hi,

On 2023-03-22 11:44:20 -0500, Justin Pryzby wrote:

On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:

On 2023-Mar-17, Andres Freund wrote:

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

+1 I agree it's not useful anymore.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Hmm, for the time being, can we just "disable" it by disallowing to set
the GUC to a value different from 0? Then we can remove the code later
in the cycle at leisure.

It can be useful to do a "rolling transition", and it's something I do
often.

But I can't see why that would be useful here? It seems like something
that could be done after the feature freeze. It's removing a feature,
not adding one.

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

Greetings,

Andres Freund

Attachments:

v1-0001-Remove-vacuum_defer_cleanup_age.patchtext/x-diff; charset=us-asciiDownload+11-166
#6Daniel Gustafsson
daniel@yesql.se
In reply to: Andres Freund (#5)
Re: Should we remove vacuum_defer_cleanup_age?

On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

-    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
+    provides protection against
     relevant rows being removed by vacuum, but the former provides no
     protection during any time period when the standby is not connected,
     and the latter often needs to be set to a high value to provide adequate

Isn't "the latter" in the kept part of the sentence referring to the guc we're
removing here?

-	 * It's possible that slots / vacuum_defer_cleanup_age backed up the
-	 * horizons further than oldest_considered_running. Fix.
+	 * It's possible that slots backed up the horizons further than
+	 * oldest_considered_running. Fix.

While not the fault of this patch, can't we use the opportunity to expand
"Fix." to a short "Fix this by ..." sentence? Or remove "Fix." perhaps, either
of those would improve the comment IMHO.

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

It might be late in the cycle, but as it's not adding something that might
break but rather removing something that's known for being problematic (and not
useful) I think it's Ok.

--
Daniel Gustafsson

#7Andres Freund
andres@anarazel.de
In reply to: Daniel Gustafsson (#6)
Re: Should we remove vacuum_defer_cleanup_age?

Hi,

On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:

On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

-    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
+    provides protection against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate

Isn't "the latter" in the kept part of the sentence referring to the guc we're
removing here?

You're right. That paragraph generally seems a bit off. In HEAD:

<para>
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
storing the segments in an archive using
<xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
However, these methods often result in retaining more WAL segments than
required, whereas replication slots retain only the number of segments
known to be needed. On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for <literal>pg_wal</literal>;
<xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
retained by replication slots.
</para>
<para>
Similarly, <xref linkend="guc-hot-standby-feedback"/>
and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate
protection. Replication slots overcome these disadvantages.
</para>

Replication slots alone don't prevent row removal, that requires
hot_standby_feedback to be used as well.

A minimal rephrasing would be:
<para>
Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
also using a replication slot, provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected. Replication slots overcome these
disadvantages.
</para>

-	 * It's possible that slots / vacuum_defer_cleanup_age backed up the
-	 * horizons further than oldest_considered_running. Fix.
+	 * It's possible that slots backed up the horizons further than
+	 * oldest_considered_running. Fix.

While not the fault of this patch, can't we use the opportunity to expand
"Fix." to a short "Fix this by ..." sentence? Or remove "Fix." perhaps, either
of those would improve the comment IMHO.

Perhaps unsurprisingly, given that I wrote that comment, I don't see a problem
with the "Fix."...

Greetings,

Andres Freund

#8Daniel Gustafsson
daniel@yesql.se
In reply to: Andres Freund (#7)
Re: Should we remove vacuum_defer_cleanup_age?

On 24 Mar 2023, at 21:27, Andres Freund <andres@anarazel.de> wrote:
On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:

On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

-    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
+    provides protection against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate

Isn't "the latter" in the kept part of the sentence referring to the guc we're
removing here?

You're right. That paragraph generally seems a bit off. In HEAD:

...

Replication slots alone don't prevent row removal, that requires
hot_standby_feedback to be used as well.

A minimal rephrasing would be:
<para>
Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
also using a replication slot, provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected. Replication slots overcome these
disadvantages.
</para>

+1, that's definitely an improvement.

--
Daniel Gustafsson

In reply to: Alvaro Herrera (#3)
Re: Should we remove vacuum_defer_cleanup_age?

On Sat, Mar 18, 2023 at 2:34 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2023-Mar-17, Andres Freund wrote:

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

+1 I agree it's not useful anymore.

+1.

I am suspicious of most of the GUCs whose value is an XID age. It
strikes me as something that is convenient to the implementation, but
not to the user, since there are so many ways that XID age might be a
poor proxy for whatever it is that you really care about in each case.

A theoretical advantage of vacuum_defer_cleanup_age is that it allows
the user to control things in terms of the impact on the primary --
whereas hot_standby_feedback is a mechanism that controls things in
terms of the needs of the standby. In practice this is pretty useless,
but it seems like it might be possible to come up with some other new
mechanism that somehow does this in a way that's truly useful.
Something that allows the user to constrain how far we hold back
conflicts/vacuuming in terms of the *impact* on the primary.

It might be helpful to permit opportunistic cleanup by pruning and
index deletion at some point, but to throttle it when we know it would
violate some soft limit related to hot_standby_feedback. Maybe the
system could prevent the first few attempts at pruning when it
violates the soft limit, or make pruning prune somewhat less
aggressively where there is little advantage to it in terms of
space/tuples freed -- decide on what to do at the very last minute,
based on all available information at that late stage, with the full
context available. The system could be taught to be very patient at
first, when relatively few pruning operations have been attempted,
when the cost is basically still acceptable. But as more pruning
operations ran and clearly didn't free space that really should be
freed, we'd quickly lose patience.

The big idea here is to delay committing to any course of action for
as long as possible, so we wouldn't kill queries on standbys for very
little benefit on the primary, while at the same time avoiding ever
really failing to kill queries on standbys when the cost proved too
high on the primary. For this to have any chance of working it needs
to focus on the actual costs on the primary, and not some extremely
noisy proxy for that cost. The standby will have its query killed by
just one prune record affecting just one heap page, and delaying that
specific prune record is likely no big deal. It's preventing pruning
of tens of thousands of heap pages that we need to worry about.

--
Peter Geoghegan

#10Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#5)
Re: Should we remove vacuum_defer_cleanup_age?

On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16.

--
Justin

#11Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#10)
Re: Should we remove vacuum_defer_cleanup_age?

Hi,

On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:

On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16.

Added the RMT.

We really should have a rmt@pg.o alias...

Updated patch attached. I think we should either apply something like that
patch, or at least add a <warning/> to the docs.

Greetings,

Andres Freund

Attachments:

v2-0001-Remove-vacuum_defer_cleanup_age.patchtext/x-diff; charset=us-asciiDownload+15-171
#12Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#11)
Re: Should we remove vacuum_defer_cleanup_age?

On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:

On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16.

Added the RMT.

We really should have a rmt@pg.o alias...

Updated patch attached. I think we should either apply something like that
patch, or at least add a <warning/> to the docs.

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

--
With Regards,
Amit Kapila.

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#11)
Re: Should we remove vacuum_defer_cleanup_age?

On 2023-Apr-11, Andres Freund wrote:

Updated patch attached. I think we should either apply something like that
patch, or at least add a <warning/> to the docs.

I gave this patch a look. The only code change is that
ComputeXidHorizons() and GetSnapshotData() no longer handle the case
where vacuum_defer_cleanup_age is different from zero. It looks good.
The TransactionIdRetreatSafely() code being removed is pretty weird (I
spent a good dozen minutes writing a complaint that your rewrite was
faulty, but it turns out I had misunderstood the function), so I'm glad
it's being retired.

<para>
-    Similarly, <xref linkend="guc-hot-standby-feedback"/>
-    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
-    relevant rows being removed by vacuum, but the former provides no
-    protection during any time period when the standby is not connected,
-    and the latter often needs to be set to a high value to provide adequate
-    protection.  Replication slots overcome these disadvantages.
+    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
+    also using a replication slot, provides protection against relevant rows
+    being removed by vacuum, but provides no protection during any time period
+    when the standby is not connected.  Replication slots overcome these
+    disadvantages.

I think it made sense to have this paragraph be separate from the
previous one when it was talking about two separate variables, but now
that it's just one, it looks a bit isolated. I would merge it with the
one above, which is talking about pretty much the same thing, and
reorder the whole thing approximately like this

<para>
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
storing the segments in an archive using
<xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
However, these methods often result in retaining more WAL segments than
required.
Similarly, <xref linkend="guc-hot-standby-feedback"/> without
a replication slot provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected.
</para>
<para>
Replication slots overcome these disadvantages by retaining only the number
of segments known to be needed.
On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for <literal>pg_wal</literal>;
<xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
retained by replication slots.
</para>

Though the "However," looks a poor fit; I would do this:

<para>
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
storing the segments in an archive using
<xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
A disadvantage of these methods is that they often result in retaining
more WAL segments than required.
Similarly, <xref linkend="guc-hot-standby-feedback"/> without
a replication slot provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected.
</para>
<para>
Replication slots overcome these disadvantages by retaining only the number
of segments known to be needed.
On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for <literal>pg_wal</literal>;
<xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
retained by replication slots.
</para>

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes... Fixed.
/messages/by-id/482D1632.8010507@sigaev.ru

#14Jonathan S. Katz
jkatz@postgresql.org
In reply to: Amit Kapila (#12)
Re: Should we remove vacuum_defer_cleanup_age?

On 4/12/23 11:34 PM, Amit Kapila wrote:

On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:

On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16.

Added the RMT.

We really should have a rmt@pg.o alias...

(I had thought something as much -- will reach out to pginfra about options)

Updated patch attached. I think we should either apply something like that
patch, or at least add a <warning/> to the docs.

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

I need a bit more time to study this before formulating an opinion on
whether we should remove it for v16. In any case, I'm not against
documentation.

Jonathan

#15Jonathan S. Katz
jkatz@postgresql.org
In reply to: Jonathan S. Katz (#14)
Re: Should we remove vacuum_defer_cleanup_age?

On 4/13/23 11:32 AM, Jonathan S. Katz wrote:

On 4/12/23 11:34 PM, Amit Kapila wrote:

On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de>

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

I need a bit more time to study this before formulating an opinion on
whether we should remove it for v16. In any case, I'm not against
documentation.

(didn't need too much more time).

[RMT hat]

+1 for removing.

I looked at some data and it doesn't seem like vacuum_defer_cleanup_age
is used in any significant way, whereas hot_standby_feedback is much
more widely used. Given this, and all the problems + arguments made in
the thread, we should just get rid of it for v16.

There are cases where we should deprecate before removing, but I don't
think this one based upon usage and having a better alternative.

Per [1]/messages/by-id/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de it does sound like we can make some improvements to
hot_standby_feedback, but those can wait to v17.

We should probably set $DATE to finish this, too. I don't think it's a
rush, but we should give enough time before Beta 1.

Jonathan

[1]: /messages/by-id/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de
/messages/by-id/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de

#16Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jonathan S. Katz (#15)
Re: Should we remove vacuum_defer_cleanup_age?

On Thu, 2023-04-13 at 12:16 -0400, Jonathan S. Katz wrote:

On 4/13/23 11:32 AM, Jonathan S. Katz wrote:

On 4/12/23 11:34 PM, Amit Kapila wrote:

On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de>

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

I need a bit more time to study this before formulating an opinion on
whether we should remove it for v16. In any case, I'm not against
documentation.

[RMT hat]

+1 for removing.

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

/messages/by-id/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.

Yours,
Laurenz Albe

#17Robert Haas
robertmhaas@gmail.com
In reply to: Laurenz Albe (#16)
Re: Should we remove vacuum_defer_cleanup_age?

On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

/messages/by-id/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.

Yeah, I feel similarly. Actually, personally I have no evidence, not
even an anecdote, suggesting that this parameter is in use, but I'm a
bit skeptical of any consensus of the form "no one is using X,"
because there sure are a lot of people running PostgreSQL and they
sure do a lot of things. Some more justifiably than others, but often
people have surprisingly good excuses for doing stuff that sounds
bizarre when you first hear about it, and it doesn't seem totally
impossible that somebody could have found a way to get value out of
this.

However, I suspect that there aren't many such people, and I think the
setting is a kludge, so I support removing it. Maybe we'll find out
that we ought to add something else instead, like a limited delimited
in time rather than in XIDs. Or maybe the existing facilities are good
enough. But as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.

--
Robert Haas
EDB: http://www.enterprisedb.com

#18Daniel Gustafsson
daniel@yesql.se
In reply to: Robert Haas (#17)
Re: Should we remove vacuum_defer_cleanup_age?

On 14 Apr 2023, at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

..as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.

Agreed, and removing it is likely to be a good vehicle for figuring out what we
should have instead (if anything).

--
Daniel Gustafsson

#19Jonathan S. Katz
jkatz@postgresql.org
In reply to: Robert Haas (#17)
Re: Should we remove vacuum_defer_cleanup_age?

On 4/14/23 8:30 AM, Robert Haas wrote:

On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

/messages/by-id/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.

Yeah, I feel similarly. Actually, personally I have no evidence, not
even an anecdote, suggesting that this parameter is in use, but I'm a
bit skeptical of any consensus of the form "no one is using X,"
because there sure are a lot of people running PostgreSQL and they
sure do a lot of things. Some more justifiably than others, but often
people have surprisingly good excuses for doing stuff that sounds
bizarre when you first hear about it, and it doesn't seem totally
impossible that somebody could have found a way to get value out of
this.

Let me restate [1]/messages/by-id/bf42784f-4d57-0a3d-1a06-ffac1a09318c@postgresql.org in a different way.

Using a large enough dataset, I did qualitatively look at overall usage
of both "vacuum_defer_cleanup_age" and compared to
"hot_standby_feedback", given you can use both to accomplish similar
outcomes. The usage of "vacuum_defer_cleanup_age" was really minimal, in
fact approaching "0", whereas "hot_standby_feedback" had significant
adoption.

I'm not saying that "we should remove a setting just because it's not
used" or that it may not have utility -- I'm saying that we can remove
the setting given:

1. We know that using this setting incorrectly (which can be done fairly
easily) can cause significant issues
2. There's another setting that can accomplish similar goals that's much
safer
3. The setting itself is not widely used

It's the combination of all 3 that led to my conclusion. If it were just
(1), I'd lean more strongly towards trying to fix it first.

However, I suspect that there aren't many such people, and I think the
setting is a kludge, so I support removing it. Maybe we'll find out
that we ought to add something else instead, like a limited delimited
in time rather than in XIDs. Or maybe the existing facilities are good
enough. But as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.

That seems like a good eventual outcome.

Thanks,

Jonathan

[1]: /messages/by-id/bf42784f-4d57-0a3d-1a06-ffac1a09318c@postgresql.org
/messages/by-id/bf42784f-4d57-0a3d-1a06-ffac1a09318c@postgresql.org

#20Bruce Momjian
bruce@momjian.us
In reply to: Jonathan S. Katz (#19)
Re: Should we remove vacuum_defer_cleanup_age?

On Fri, 14 Apr 2023 at 09:47, Jonathan S. Katz <jkatz@postgresql.org> wrote:

Let me restate [1] in a different way.

Using a large enough dataset, I did qualitatively look at overall usage
of both "vacuum_defer_cleanup_age" and compared to
"hot_standby_feedback", given you can use both to accomplish similar
outcomes.

I assume people would use hot_standby_feedback if they have streaming
replication. The main use cases for vacuum_defer_cleanup_age would be
if you're replaying WAL files. That may sound archaic but there are
plenty of circumstances where your standby may not have network access
to your primary at all or not want to be replaying continuously.

I wonder whether your dataset is self-selecting sites that have
streaming replication. That does seem like the more common usage
pattern.

Systems using wal files are more likely to be things like data
warehouses, offline analytics systems, etc. They may not even be well
known in the same organization that runs the online operations -- in
my experience they're often run by marketing or sales organizations or
in some cases infosec teams and consume data from lots of sources. The
main reason to use wal archive replay is often to provide the
isolation so that the operations team don't need to worry about the
impact on production and that makes it easy to forget these even
exist.

--
greg

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#20)
#22Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alvaro Herrera (#21)
#23Bruce Momjian
bruce@momjian.us
In reply to: Laurenz Albe (#22)
#24Jonathan S. Katz
jkatz@postgresql.org
In reply to: Laurenz Albe (#22)
#25Nathan Bossart
nathandbossart@gmail.com
In reply to: Jonathan S. Katz (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#13)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#26)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#27)
#29Phil Florent
philflorent@hotmail.com
In reply to: Andres Freund (#26)
#30Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#27)