A micro-optimisation for ProcSendSignal()

Started by Thomas Munroabout 5 years ago9 messageshackers
Jump to latest
#1Thomas Munro
thomas.munro@gmail.com

Hi,

ProcSendSignal(pid) searches the ProcArray for the given pid and then
sets that backend's procLatch. It has only two users: UnpinBuffer()
and ReleasePredicateLocks(). In both cases, we could just as easily
have recorded the pgprocno instead, avoiding the locking and the
searching. We'd also be able to drop some special book-keeping for
the startup process, whose pid can't be found via the ProcArray.

A related idea, saving space in BufferDesc but having to do slightly
more expensive work, would be for UnpinBuffer() to reuse the new
condition variable instead of ProcSendSignal().

Attachments:

0001-Optimize-ProcSendSignal.patchtext/x-patch; charset=US-ASCII; name=0001-Optimize-ProcSendSignal.patchDownload+16-61
#2Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#1)
Re: A micro-optimisation for ProcSendSignal()

On Fri, Mar 12, 2021 at 12:31 AM Thomas Munro <thomas.munro@gmail.com> wrote:

ProcSendSignal(pid) searches the ProcArray for the given pid and then
sets that backend's procLatch. It has only two users: UnpinBuffer()
and ReleasePredicateLocks(). In both cases, we could just as easily
have recorded the pgprocno instead, avoiding the locking and the
searching. We'd also be able to drop some special book-keeping for
the startup process, whose pid can't be found via the ProcArray.

Rebased.

Attachments:

v2-0001-Optimize-ProcSendSignal.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Optimize-ProcSendSignal.patchDownload+17-61
#3Soumyadeep Chakraborty
soumyadeep2007@gmail.com
In reply to: Thomas Munro (#2)
Re: A micro-optimisation for ProcSendSignal()

Hi Thomas,

You might have missed a spot to initialize SERIALIZABLE_XACT->pgprocno in
InitPredicateLocks(), so:

+ PredXact->OldCommittedSxact->pgprocno = INVALID_PGPROCNO;

Slightly tangential: we should add a comment to PGPROC.pgprocno, for more
immediate understandability:

+ int pgprocno; /* index of this PGPROC in ProcGlobal->allProcs */

Also, why don't we take the opportunity to get rid of SERIALIZABLEXACT->pid? We
took a stab. Attached is v2 of your patch with these changes.

Regards,
Ashwin and Deep

Attachments:

v2-0001-Optimize-ProcSendSignal.patchapplication/x-patch; name=v2-0001-Optimize-ProcSendSignal.patchDownload+34-73
#4Thomas Munro
thomas.munro@gmail.com
In reply to: Soumyadeep Chakraborty (#3)
Re: A micro-optimisation for ProcSendSignal()

Hi Soumyadeep and Ashwin,

Thanks for looking!

On Sun, Jul 18, 2021 at 6:58 AM Soumyadeep Chakraborty
<soumyadeep2007@gmail.com> wrote:

You might have missed a spot to initialize SERIALIZABLE_XACT->pgprocno in
InitPredicateLocks(), so:

+ PredXact->OldCommittedSxact->pgprocno = INVALID_PGPROCNO;

The magic OldCommittedSxact shouldn't be the target of a "signal", but
this is definitely tidier. Thanks.

Slightly tangential: we should add a comment to PGPROC.pgprocno, for more
immediate understandability:

+ int pgprocno; /* index of this PGPROC in ProcGlobal->allProcs */

I wonder why we need this member anyway, when you can compute it from
the address... #define GetPGProcNumber(p) ((p) - ProcGlobal->allProcs)
or something like that? Kinda wonder why we don't use
GetPGProcByNumber() in more places instead of open-coding access to
ProcGlobal->allProcs, too...

Also, why don't we take the opportunity to get rid of SERIALIZABLEXACT->pid? We
took a stab. Attached is v2 of your patch with these changes.

SERIALIZABLEXACT objects can live longer than the backends that
created them. They hang around to sabotage other transactions' plans,
depending on what else they overlapped with before they committed.
With that change, the pg_locks view might show the pid of some
unrelated session that moves into the same PGPROC.

It's only an "informational" pid, and pids are imperfect information
anyway because (1) they are themselves recycled, and (2) they won't be
interesting in a hypothetical multi-threaded future. One solution
would be to hide the pids from the view after the backend disconnects
(somehow -- add a generation number?), but they're also still kinda
useful, despite the weaknesses. I wonder what the ideal way would be
to refer to sessions, anyway, including those that are no longer
active; perhaps we could invent a new "session ID" concept.

Attachments:

v3-0001-Optimize-ProcSendSignal.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Optimize-ProcSendSignal.patchDownload+19-61
#5Soumyadeep Chakraborty
soumyadeep2007@gmail.com
In reply to: Thomas Munro (#4)
Re: A micro-optimisation for ProcSendSignal()

HI Thomas,

On Tue, Jul 20, 2021 at 10:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Slightly tangential: we should add a comment to PGPROC.pgprocno, for more
immediate understandability:

+ int pgprocno; /* index of this PGPROC in ProcGlobal->allProcs */

I wonder why we need this member anyway, when you can compute it from
the address... #define GetPGProcNumber(p) ((p) - ProcGlobal->allProcs)
or something like that? Kinda wonder why we don't use
GetPGProcByNumber() in more places instead of open-coding access to
ProcGlobal->allProcs, too...

I tried this out. See attached v4 of your patch with these changes.

Also, why don't we take the opportunity to get rid of SERIALIZABLEXACT->pid? We
took a stab. Attached is v2 of your patch with these changes.

SERIALIZABLEXACT objects can live longer than the backends that
created them. They hang around to sabotage other transactions' plans,
depending on what else they overlapped with before they committed.
With that change, the pg_locks view might show the pid of some
unrelated session that moves into the same PGPROC.

I see.

It's only an "informational" pid, and pids are imperfect information
anyway because (1) they are themselves recycled, and (2) they won't be
interesting in a hypothetical multi-threaded future. One solution
would be to hide the pids from the view after the backend disconnects
(somehow -- add a generation number?), but they're also still kinda
useful, despite the weaknesses. I wonder what the ideal way would be
to refer to sessions, anyway, including those that are no longer
active; perhaps we could invent a new "session ID" concept.

Updating the pg_locks view:

Yes, the pids may be valuable for future debugging/audit purposes. Also,
systems where pid_max is high enough to not see wraparound, will have
pids that are not recycled. I would lean towards showing the pid even
after the backend has exited.

Perhaps we could overload the stored pid to be negated (i.e. a backend
with pid 20000 will become -20000) to indicate that the pid belongs to
a backend that has exited. Additionally, we could introduce a boolean
field in pg_locks "backendisalive", so that the end user doesn't have
to reason about negative pids.

Session ID:

Interesting, Greenplum uses the concept of session ID pervasively. Being
a distributed database, the session ID helps to tie individual backends
on each PG instance to the same session. The session ID of course comes
with its share of bookkeeping:

* These IDs are incrementally dished out with a counter
(with pg_atomic_add_fetch_u32), in increments of 1, on the Greenplum
coordinator PG instance in InitProcess.

* The counter is a part of ProcGlobal and itself is initialized to 0 in
InitProcGlobal, which means that session IDs are reset on cluster
restart.

* The sessionID tied to each proceess is maintained in PGPROC.

* The sessionID finds its way into PgBackendStatus, which is further
reported with pg_stat_get_activity.

A session ID seems a bit heavy just to indicate whether a backend has
exited.

Regards,
Soumyadeep

Attachments:

v4-0001-Optimize-ProcSendSignal.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Optimize-ProcSendSignal.patchDownload+42-82
#6Thomas Munro
thomas.munro@gmail.com
In reply to: Soumyadeep Chakraborty (#5)
Re: A micro-optimisation for ProcSendSignal()

Hi Soumyadeep,

On Sat, Jul 24, 2021 at 5:26 PM Soumyadeep Chakraborty
<soumyadeep2007@gmail.com> wrote:

On Tue, Jul 20, 2021 at 10:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I wonder why we need this member anyway, when you can compute it from
the address... #define GetPGProcNumber(p) ((p) - ProcGlobal->allProcs)
or something like that? Kinda wonder why we don't use
GetPGProcByNumber() in more places instead of open-coding access to
ProcGlobal->allProcs, too...

I tried this out. See attached v4 of your patch with these changes.

I like it. I've moved these changes to a separate patch, 0002, for
separate commit. I tweaked a couple of comments (it's not a position
in the "procarray", well it's a position stored in the procarray, but
that's confusing; I also found a stray comment about leader->pgprocno
that is obsoleted by this change). Does anyone have objections to
this?

I was going to commit the earlier change this morning, but then I read [1]/messages/by-id/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de.

New idea. Instead of adding pgprocno to SERIALIZABLEXACT (which we
should really be trying to shrink, not grow), let's look it up by
vxid->backendId. I didn't consider that before, because I was trying
not to get tangled up with BackendIds for various reasons, not least
that that's yet another lock + O(n) search.

It seems likely that getting from vxid to latch will be less clumsy in
the near future. That refactoring and harmonising of backend
identifiers seems like a great idea to me. Here's a version that
anticipates that, using vxid->backendId to wake a sleeping
SERIALIZABLE READ ONLY DEFERRABLE backend, without having to add a new
member to the struct.

A session ID seems a bit heavy just to indicate whether a backend has
exited.

Yeah. A Greenplum-like session ID might eventually be necessary in a
world where sessions are divorced from processes and handled by a pool
of worker threads, though. /me gazes towards the horizon

[1]: /messages/by-id/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de

Attachments:

v5-0001-Optimize-ProcSendSignal.patchtext/x-patch; charset=US-ASCII; name=v5-0001-Optimize-ProcSendSignal.patchDownload+19-63
v5-0002-Remove-PGPROC-s-redundant-pgprocno-member.patchtext/x-patch; charset=US-ASCII; name=v5-0002-Remove-PGPROC-s-redundant-pgprocno-member.patchDownload+31-30
#7Soumyadeep Chakraborty
soumyadeep2007@gmail.com
In reply to: Thomas Munro (#6)
Re: A micro-optimisation for ProcSendSignal()

Hey Thomas,

On Mon, Aug 2, 2021 at 6:45 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Hi Soumyadeep,

On Sat, Jul 24, 2021 at 5:26 PM Soumyadeep Chakraborty
<soumyadeep2007@gmail.com> wrote:

On Tue, Jul 20, 2021 at 10:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I wonder why we need this member anyway, when you can compute it from
the address... #define GetPGProcNumber(p) ((p) - ProcGlobal->allProcs)
or something like that? Kinda wonder why we don't use
GetPGProcByNumber() in more places instead of open-coding access to
ProcGlobal->allProcs, too...

I tried this out. See attached v4 of your patch with these changes.

I like it. I've moved these changes to a separate patch, 0002, for
separate commit. I tweaked a couple of comments (it's not a position
in the "procarray", well it's a position stored in the procarray, but
that's confusing; I also found a stray comment about leader->pgprocno
that is obsoleted by this change). Does anyone have objections to
this?

Awesome, thanks! Looks good.

I was going to commit the earlier change this morning, but then I read [1].

New idea. Instead of adding pgprocno to SERIALIZABLEXACT (which we
should really be trying to shrink, not grow), let's look it up by
vxid->backendId. I didn't consider that before, because I was trying
not to get tangled up with BackendIds for various reasons, not least
that that's yet another lock + O(n) search.

It seems likely that getting from vxid to latch will be less clumsy in
the near future. That refactoring and harmonising of backend
identifiers seems like a great idea to me. Here's a version that
anticipates that, using vxid->backendId to wake a sleeping
SERIALIZABLE READ ONLY DEFERRABLE backend, without having to add a new
member to the struct.

Neat! A Vxid -> PGPROC lookup eventually becomes an O(1) operation with the
changes proposed at the ending paragraph of [1]/messages/by-id/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de.

[1]: /messages/by-id/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de

Regards,
Soumyadeep (VMware)

#8Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#6)
Re: A micro-optimisation for ProcSendSignal()

Hi,

On 2021-08-03 13:44:58 +1200, Thomas Munro wrote:

New idea. Instead of adding pgprocno to SERIALIZABLEXACT (which we
should really be trying to shrink, not grow), let's look it up by
vxid->backendId. I didn't consider that before, because I was trying
not to get tangled up with BackendIds for various reasons, not least
that that's yet another lock + O(n) search.

It seems likely that getting from vxid to latch will be less clumsy in
the near future.

So this change only makes sense of vxids would start to use pgprocno instead
of backendid, basically?

From b284d8f29efc1c16c3aa75b64d9a940bcb74872c Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Tue, 3 Aug 2021 10:02:15 +1200
Subject: [PATCH v5 1/2] Optimize ProcSendSignal().

Instead of referring to target backends by pid, use pgprocno. This
means that we don't have to scan the ProcArray, and we can drop some
special case code for dealing with the startup process.

In the case of buffer pin waits, we switch to storing the pgprocno of
the waiter. In the case of SERIALIZABLE READ ONLY DEFERRABLE waits, we
derive the pgprocno from the vxid (though that's not yet as efficient as
it could be).

That's kind of an understatement :)

-ProcSendSignal(int pid)
+ProcSendSignal(int pgprocno)
{
-	PGPROC	   *proc = NULL;
-
-	if (RecoveryInProgress())
-	{
-		SpinLockAcquire(ProcStructLock);
-
-		/*
-		 * Check to see whether it is the Startup process we wish to signal.
-		 * This call is made by the buffer manager when it wishes to wake up a
-		 * process that has been waiting for a pin in so it can obtain a
-		 * cleanup lock using LockBufferForCleanup(). Startup is not a normal
-		 * backend, so BackendPidGetProc() will not return any pid at all. So
-		 * we remember the information for this special case.
-		 */
-		if (pid == ProcGlobal->startupProcPid)
-			proc = ProcGlobal->startupProc;
-
-		SpinLockRelease(ProcStructLock);
-	}
-
-	if (proc == NULL)
-		proc = BackendPidGetProc(pid);
-
-	if (proc != NULL)
-	{
-		SetLatch(&proc->procLatch);
-	}
+	SetLatch(&ProcGlobal->allProcs[pgprocno].procLatch);
}

I think some validation of the pgprocno here would be a good idea. I'm worried
that something could cause e.g. INVALID_PGPROCNO to be passed in, and suddenly
we're modifying random memory. That could end up being a pretty hard bug to
catch, because we might not even notice that the right latch isn't set...

From 562657ea3f7be124a6c6b6d1e7450da2431a54a0 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 11 Mar 2021 23:09:11 +1300
Subject: [PATCH v5 2/2] Remove PGPROC's redundant pgprocno member.

It's derivable with pointer arithmetic.

Author: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Discussion:
/messages/by-id/CA+hUKGLYRyDaneEwz5Uya_OgFLMx5BgJfkQSD=q9HmwsfRRb-w@mail.gmail.com

/* Accessor for PGPROC given a pgprocno. */
#define GetPGProcByNumber(n) (&ProcGlobal->allProcs[(n)])
+/* Accessor for pgprocno given a pointer to PGPROC. */
+#define GetPGProcNumber(proc) ((proc) - ProcGlobal->allProcs)

I'm not sure this is a good idea. There's no really cheap way for the compiler
to compute this, because sizeof(PGPROC) isn't a power of two. Given that
PGPROC is ~880 bytes, I don't see all that much gain in getting rid of
->pgprocno.

Yes, compilers can optimize away the super expensive division, but it'll end
up as something like subtraction, shift, multiplication - not that cheap
either. And I suspect it'll often have to first load the ProcGlobal via the
GOT as well...

Greetings,

Andres Freund

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#8)
Re: A micro-optimisation for ProcSendSignal()

On Tue, Aug 3, 2021 at 2:56 PM Andres Freund <andres@anarazel.de> wrote:

On 2021-08-03 13:44:58 +1200, Thomas Munro wrote:

In the case of buffer pin waits, we switch to storing the pgprocno of
the waiter. In the case of SERIALIZABLE READ ONLY DEFERRABLE waits, we
derive the pgprocno from the vxid (though that's not yet as efficient as
it could be).

That's kind of an understatement :)

I abandoned the vxid part for now and went back to v3. If/when
BackendId is replaced with or becomes synonymous with pgprocno, we can
make this change and drop the pgprocno member from SERIALIZABLEXACT.

+ SetLatch(&ProcGlobal->allProcs[pgprocno].procLatch);

I think some validation of the pgprocno here would be a good idea. I'm worried
that something could cause e.g. INVALID_PGPROCNO to be passed in, and suddenly
we're modifying random memory. That could end up being a pretty hard bug to
catch, because we might not even notice that the right latch isn't set...

Added.

/* Accessor for PGPROC given a pgprocno. */
#define GetPGProcByNumber(n) (&ProcGlobal->allProcs[(n)])
+/* Accessor for pgprocno given a pointer to PGPROC. */
+#define GetPGProcNumber(proc) ((proc) - ProcGlobal->allProcs)

I'm not sure this is a good idea. There's no really cheap way for the compiler
to compute this, because sizeof(PGPROC) isn't a power of two. Given that
PGPROC is ~880 bytes, I don't see all that much gain in getting rid of
->pgprocno.

Yeah, that would need some examination; 0002 patch abandoned for now.

Pushed.