Broken O(n^2) avoidance in wal segment recycling.

Started by Andres Freundalmost 9 years ago8 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Branch: master Release: REL9_5_BR [b2a5545bd] 2015-04-13 16:53:49 +0300
Branch: REL9_4_STABLE Release: REL9_4_2 [d72792d02] 2015-04-13 17:22:21 +0300
Branch: REL9_3_STABLE Release: REL9_3_7 [a800267e4] 2015-04-13 17:22:35 +0300
Branch: REL9_2_STABLE Release: REL9_2_11 [cc2939f44] 2015-04-13 17:26:59 +0300
Branch: REL9_1_STABLE Release: REL9_1_16 [ad2925e20] 2015-04-13 17:26:49 +0300
Branch: REL9_0_STABLE Release: REL9_0_20 [5b6938186] 2015-04-13 17:26:35 +0300

Don't archive bogus recycled or preallocated files after timeline switch.

Moved xlog file deletion from RemoveOldXlogFiles() into its own
RemoveXlogFile() routine, because it introduced a new function also
deleting/recycling segments.

It did so moving
+       /* Needn't recheck that slot on future iterations */
+       endlogSegNo++;
into the new routine. But it's useless there, because it's just a stack
variable, which is going to be freshly computed with
+   XLByteToPrevSeg(endptr, endlogSegNo);
on the next call.

That logic was introduced in

commit 61b861421b0b849a0dffe36238b8e504624831c1
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2005-04-15 18:48:10 +0000

Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior
when recycling a large number of xlog segments during checkpoint.
The former behavior searched from the same start point each time,
requiring O(checkpoint_segments^2) stat() calls to relocate all the
segments. Instead keep track of where we stopped last time through.

but was neutered by the commit above.

We've not heard any complaints about this afaik, but it's not something
that's easily diagnosable as being a problem. Therefore I suspect we
should fix and backpatch this?

Heikki?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#1)
Re: Broken O(n^2) avoidance in wal segment recycling.

On Thu, Jun 22, 2017 at 6:10 AM, Andres Freund <andres@anarazel.de> wrote:

We've not heard any complaints about this afaik, but it's not something
that's easily diagnosable as being a problem. Therefore I suspect we
should fix and backpatch this?

Agreed. I have just poked at this problem and have finished with the
attached. Logically it is not complicated at the argument values used
by the callers of RemoveXlogFile() are never updated when scanning
pg_wal. Surely this patch needs an extra pair of eyes.
--
Michael

Attachments:

fix-wal-recycle-perf.patchtext/x-patch; charset=US-ASCII; name=fix-wal-recycle-perf.patchDownload+29-23
#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
Re: Broken O(n^2) avoidance in wal segment recycling.

On Fri, Jun 23, 2017 at 3:25 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Thu, Jun 22, 2017 at 6:10 AM, Andres Freund <andres@anarazel.de> wrote:

We've not heard any complaints about this afaik, but it's not something
that's easily diagnosable as being a problem. Therefore I suspect we
should fix and backpatch this?

Agreed. I have just poked at this problem and have finished with the
attached. Logically it is not complicated at the argument values used
by the callers of RemoveXlogFile() are never updated when scanning
pg_wal. Surely this patch needs an extra pair of eyes.

Andres has pointed me out offline that a bad copy-paste re-introduced
fb886c15. So it is important to properly track the segment name of the
switch point as well as the the segment from where the recycling
begins. There was as well a second mistake in RemoveOldXlogFiles()
causing unnecessary segments to be kept caused by the same copy-pasto.
--
Michael

Attachments:

fix-wal-recycle-perf-v2.patchtext/x-patch; charset=US-ASCII; name=fix-wal-recycle-perf-v2.patchDownload+33-25
#4Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#3)
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.

Hi,

I just hit this bad a couple times during some testing. Under load, with
2500 segments to recycle, it took well over a minute.

I think we ought to backpatch a version of this fix. Yes, there've not
been many complaints, but there's no messages during log levels one can
enable without beeing completely flooded, so I don't think it's every
likely to be debugged on a live system.

Noteworth is that this stall due to the O(n^2) happens while holding
ControlFileLock.

I'm inclined to think we should even backpatch this.

Heikki, Michael, any comment?

- Andres

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.

Andres Freund <andres@anarazel.de> writes:

I think we ought to backpatch a version of this fix.

Uh ... what fix are you talking about?

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.

Hi,

On 2019-05-06 22:54:43 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I think we ought to backpatch a version of this fix.

Uh ... what fix are you talking about?

Michael's email had a proposed patch. I think there's a few small
changes needed, but otherwise it looks like the right direction to me.

Greetings,

Andres Freund

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#4)
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.

On Tue, May 7, 2019 at 2:45 PM Andres Freund <andres@anarazel.de> wrote:

I just hit this bad a couple times during some testing. Under load, with
2500 segments to recycle, it took well over a minute.

I wonder if this played a part in the wal_recycle=off-for-ZFS thing.

--
Thomas Munro
https://enterprisedb.com

#8Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#6)
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.

On Mon, May 06, 2019 at 08:00:23PM -0700, Andres Freund wrote:

Michael's email had a proposed patch. I think there's a few small
changes needed, but otherwise it looks like the right direction to me.

I would not mind seeing this stuff fixed and back-patched.
--
Michael