Optimizing TransactionIdIsCurrentTransactionId()

Started by Simon Riggsabout 6 years ago9 messages
#1Simon Riggs
simon@2ndquadrant.com
1 attachment(s)

TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for
the case when an xid has not yet been assigned, so for read only
transactions.

A patch for this is attached.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Solutions for the Enterprise

Attachments:

optimize_TransactionIdIsCurrentTransactionId_readonly.v1.patchapplication/octet-stream; name=optimize_TransactionIdIsCurrentTransactionId_readonly.v1.patchDownload
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 5353b6ab0b..42d2174dd4 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -853,6 +853,7 @@ bool
 TransactionIdIsCurrentTransactionId(TransactionId xid)
 {
 	TransactionState s;
+	TransactionId topxid = GetTopTransactionIdIfAny();
 
 	/*
 	 * We always say that BootstrapTransactionId is "not my transaction ID"
@@ -867,10 +868,10 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 	 * not my transaction ID, so we can just return "false" immediately for
 	 * any non-normal XID.
 	 */
-	if (!TransactionIdIsNormal(xid))
+	if (!TransactionIdIsNormal(xid) || !TransactionIdIsNormal(topxid))
 		return false;
 
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+	if (TransactionIdEquals(xid, topxid))
 		return true;
 
 	/*
#2Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#1)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:

TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.

A patch for this is attached.

It might be an idea to first call TransactionIdIsNormal(xid), then
GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
that we don't bother with GetTopTransactionIdIfAny() when
!TransactionIdIsNormal(xid).

But it's also not clear to me whether this is actually a win. You're
dong an extra TransactionIdIsNormal() test to sometimes avoid a
GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty
cheap, but GetTopTransactionIdIfAny() isn't all that expensive either,
and adding more branches costs something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Robert Haas (#2)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Thu, Dec 19, 2019 at 02:27:01PM -0500, Robert Haas wrote:

On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:

TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.

A patch for this is attached.

It might be an idea to first call TransactionIdIsNormal(xid), then
GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
that we don't bother with GetTopTransactionIdIfAny() when
!TransactionIdIsNormal(xid).

But it's also not clear to me whether this is actually a win. You're
dong an extra TransactionIdIsNormal() test to sometimes avoid a
GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty
cheap, but GetTopTransactionIdIfAny() isn't all that expensive either,
and adding more branches costs something.

I think "optimization" patches should generally come with some sort of
quantification of the gains - e.g. a benchmark with somewhat realistic
workload (but even synthetic is better than nothing). Or at least some
explanation *why* it's going to be an improvement.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Simon Riggs
simon@2ndquadrant.com
In reply to: Robert Haas (#2)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Thu, 19 Dec 2019 at 19:27, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:

TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized

for the case when an xid has not yet been assigned, so for read only
transactions.

A patch for this is attached.

It might be an idea to first call TransactionIdIsNormal(xid), then
GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
that we don't bother with GetTopTransactionIdIfAny() when
!TransactionIdIsNormal(xid).

But it's also not clear to me whether this is actually a win. You're
dong an extra TransactionIdIsNormal() test to sometimes avoid a
GetTopTransactionIdIfAny() test.

That's not the point of the patch.

If the TopTransactionId is not assigned, we can leave the whole function
more quickly, not just avoid a test.

Read only transactions should have a fast path thru this function since
they frequently read more data than write transactions.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Solutions for the Enterprise

#5Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#4)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Fri, Dec 20, 2019 at 12:46 AM Simon Riggs <simon@2ndquadrant.com> wrote:

If the TopTransactionId is not assigned, we can leave the whole function more quickly, not just avoid a test.

Those things are not really any different from each other. You leave
the function when you've done all the necessary tests....

Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.

With regard to this point, I second Tomas's comments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Simon Riggs
simon@2ndquadrant.com
In reply to: Robert Haas (#5)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:

Read only transactions should have a fast path thru this function since

they frequently read more data than write transactions.

With regard to this point, I second Tomas's comments.

I also agree with Tomas' comments. I am explaining *why* it will be an
improvement, expanding on my earlier notes.

This function is called extremely frequently in query processing and is
fairly efficient. I'm pointing out cases where making it even quicker makes
sense.

The TopXid is assigned in very few calls. Write transactions perform
searching before the xid is assigned, so UPDATE and DELETE transactions
will call this with TopXid unassigned in many small transactions, e.g.
simple pgbench. In almost all read-only cases and especially on standby
nodes there will be no TopXid assigned, so I estimate that 90-99% of calls
will be made with TopXid invalid. In this case it makes a great deal of
sense to have a fastpath out of this function, by testing
TransactionIdIsNormal(topxid).

I also now notice that on entry the xid provided is hardly ever
InvalidTransactionId. Once, it might have been called repeatedly with
FrozenTransactionId, but that is no longer the case since we no longer
reset the xid on freezing. So the test for TransactionIdIsNormal(xid)
appears to need rethinking since it is now mostly redundant.

So if adding a test is considered heavy, I would swap the test for
TransactionIdIsNormal(xid) and replace with a test for
TransactionIdIsNormal(topxid).

Such a frequently used function is worth discussing, just as we previously
optimised TransactionIdIsInProgress() and MVCC visibility routines, where
we discussed what the most common routes through the functions were before
deciding how to optimize them.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Solutions for the Enterprise

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#6)
Re: Optimizing TransactionIdIsCurrentTransactionId()

Simon Riggs <simon@2ndquadrant.com> writes:

On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:

With regard to this point, I second Tomas's comments.

I also agree with Tomas' comments. I am explaining *why* it will be an
improvement, expanding on my earlier notes.
This function is called extremely frequently in query processing and is
fairly efficient. I'm pointing out cases where making it even quicker makes
sense.

I think the point is that you haven't demonstrated that this particular
patch makes it quicker.

regards, tom lane

#8Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:

With regard to this point, I second Tomas's comments.

I also agree with Tomas' comments. I am explaining *why* it will be an
improvement, expanding on my earlier notes.
This function is called extremely frequently in query processing and is
fairly efficient. I'm pointing out cases where making it even quicker

makes

sense.

I think the point is that you haven't demonstrated that this particular
patch makes it quicker.

Not yet, but I was trying to agree what an appropriate test would be before
running it.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Solutions for the Enterprise

#9Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Simon Riggs (#8)
Re: Optimizing TransactionIdIsCurrentTransactionId()

On Fri, Dec 20, 2019 at 05:57:55PM +0000, Simon Riggs wrote:

On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:

With regard to this point, I second Tomas's comments.

I also agree with Tomas' comments. I am explaining *why* it will be an
improvement, expanding on my earlier notes.
This function is called extremely frequently in query processing and is
fairly efficient. I'm pointing out cases where making it even quicker

makes

sense.

I think the point is that you haven't demonstrated that this particular
patch makes it quicker.

Not yet, but I was trying to agree what an appropriate test would be before
running it.

Isn't that a bit backwards? I mean, we usually identify opportunities
for optimizations by observing poor performance with a workload, which
means that workload can serve as a test. Of course, it's possible to
notice an opprtunity by eye-balling the code, but you've already said
this is supposed to improve read-only transactions.

I've actually tried to measure if/how this affects performance using a
simple read-only pgbench

pgbench -S -M prepared -T 60 test

I did this with a long-running transaction to prevent hint bits from
getting set. But I've not measured any difference in performane. So
either this improves a different workload, or maybe I'm doing something
silly that makes the patch irrelevant.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services