SUBTRANS: Minimizing calls to SubTransSetParent()
On Thu, 4 Aug 2022 at 13:11, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Wed, 3 Aug 2022 at 20:18, Andres Freund <andres@anarazel.de> wrote:
I think we should consider redesigning subtrans more substantially - even with
the changes you propose here, there's still plenty of ways to hit really bad
performance. And there's only so much we can do about that without more
fundamental design changes.
I completely agree - you will be glad to hear that I've been working
on a redesign of the subtrans module.
...
I will post my patch, when complete, in a different thread.
The attached patch reduces the overhead of SUBTRANS by minimizing the
number of times SubTransSetParent() is called, to below 1% of the
current rate in common cases.
Instead of blindly calling SubTransSetParent() for every subxid, this
proposal only calls SubTransSetParent() when that information will be
required for later use. It does this by analyzing all of the callers
of SubTransGetParent() and using their pre-conditions to filter out
calls/subxids that will never be required, for various reasons. It
redesigns the way XactLockTableWait() calls
SubTransGetTopmostTransactionId() to allow this.
This short patchset compiles and passes make check-world, with lengthy comments.
This might then make viable a simple rewrite of SUBTRANS using a hash
table, as proposed by Andres. But in any case, it will allow us to
design a densely packed SUBTRANS replacement that does not generate as
much contention and I/O.
NOTE that this patchset does not touch SUBTRANS at all; it just
minimizes the calls in preparation for a later redesign. If this
patch (or a later version of it) is committed in the September CF,
then we should be in good shape to post a subtrans redesign patch by
the major patch deadline at the end of the year.
Patches 001 and 002 are common elements of a different patch,
"Smoothing the subtrans performance catastrophe", but other than that
the two patches are independent of each other.
Where does this come from? I learnt a lot about subxids when coding
Hot Standby, specifically commit 06da3c570f21394003. This patch just
builds upon that earlier understanding.
Comments please.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
003_minimize_calls_to_SubTransSetParent.v7.patch (application/octet-stream, +135 -5)
002_new_isolation_tests_for_subxids.v2.patch (application/octet-stream, +64 -0)
001_subx_include_subxids_even_if_overflowed.v2.patch (application/octet-stream, +29 -29)
On Mon, Aug 8, 2022 at 6:41 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Does this patch set work independently, or does it have a dependency
on the patches on the other thread "Smoothing the subtrans performance
catastrophe"? I ask because in this patch I see no code where we are
changing anything to control the access of SubTransGetParent() from
SubTransGetTopmostTransactionId().
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, 9 Aug 2022 at 12:39, Dilip Kumar <dilipbalaut@gmail.com> wrote:
This short patchset compiles and passes make check-world, with lengthy comments.
Does this patch set work independently or it has dependency on the
patches on the other thread "Smoothing the subtrans performance
catastrophe"?
Patches 001 and 002 are common elements of a different patch,
"Smoothing the subtrans performance catastrophe", but other than that,
the two patches are otherwise independent of each other.
i.e. there are common elements in both patches
001 puts all subxid data in a snapshot (up to a limit of 64 xids per
topxid), even if one or more xids overflows.
Because in this patch I see no code where we are
changing anything to control the access of SubTransGetParent() from
SubTransGetTopmostTransactionId()?
Those calls are unaffected, i.e. they both still work.
Right now, we register all subxids in subtrans. But not all xids are
subxids, so in fact, subtrans has many "holes" in it, where if you
look up the parent for an xid it will just return
InvalidTransactionId. There is some protection against that causing a
problem: if you call TransactionIdDidCommit/Abort you can get a
WARNING, and if you call SubTransGetTopmostTransaction() you can get
an ERROR, but either is possible only if you do a lookup for an
inappropriate xid, i.e. if you call TransactionIdDidCommit() without
first calling TransactionIdIsInProgress() as you are supposed to do.
What this patch does is increase the number of "holes" in subtrans,
reducing the overhead and making the subtrans data structure more
amenable to using a dense structure rather than a sparse structure as
we do now, which then leads to I/O overheads. But in this patch, we
only have holes when we can prove that the subxid's parent will never
be requested.
Specifically, with this patch, running PL/pgSQL with a few
subtransactions in will cause those subxids to be logged in subtrans
about 1% as often as they are now, so greatly reducing the number of
subtrans calls.
Happy to provide more detailed review thoughts, so please keep asking questions.
--
Simon Riggs http://www.EnterpriseDB.com/
On Tue, Aug 9, 2022 at 9:46 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
IIUC, if SubTransGetParent() returns InvalidTransactionId then the
SubTransGetTopmostTransaction() loop will break and return the
previous xid. So if we pass any topxid to
SubTransGetTopmostTransaction() it will return back the same xid, and
that's fine, as next we are going to search in the snapshot->xip array.
But suppose we call this function with a subxid that is present in the
snapshot->subxip array. If we first call
SubTransGetTopmostTransaction() then it will just return the same xid
when no parent is set for it, and if we then search for it in the
snapshot->xip array, won't we get the wrong answer?
So I still think some adjustment is required in XidInMVCCSnapshot()
such that we first search the snapshot->subxip array.
Am I still missing something?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, 10 Aug 2022 at 08:34, Dilip Kumar <dilipbalaut@gmail.com> wrote:
Am I still missing something?
No, you have found a dependency between the patches that I was unaware
of. So there is no bug if you apply both patches.
Thanks for looking.
So I still think some adjustment is required in XidInMVCCSnapshot()
That is one way to resolve the issue, but not the only one. I can also
change AssignTransactionId() to recursively register parent xids for
all of a subxid's parents.
I will add in a test case and resolve the dependency in my next patch.
Thanks again.
--
Simon Riggs http://www.EnterpriseDB.com/
On Wed, Aug 10, 2022 at 6:31 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
On Wed, 10 Aug 2022 at 08:34, Dilip Kumar <dilipbalaut@gmail.com> wrote:
Am I still missing something?
No, you have found a dependency between the patches that I was unaware
of. So there is no bug if you apply both patches.
Right
So I still think some adjustment is required in XidInMVCCSnapshot()
That is one way to resolve the issue, but not the only one. I can also
change AssignTransactionId() to recursively register parent xids for
all of a subxid's parents.
I will add in a test case and resolve the dependency in my next patch.
Okay, thanks, I will look into the updated patch after you submit that.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, 11 Aug 2022 at 06:32, Dilip Kumar <dilipbalaut@gmail.com> wrote:
So I still think some adjustment is required in XidInMVCCSnapshot()
That is one way to resolve the issue, but not the only one. I can also
change AssignTransactionId() to recursively register parent xids for
all of a subxid's parents.
I will add in a test case and resolve the dependency in my next patch.
Okay, thanks, I will look into the updated patch after you submit that.
PFA two patches, replacing earlier work
001_new_isolation_tests_for_subxids.v3.patch
002_minimize_calls_to_SubTransSetParent.v8.patch
001_new_isolation_tests_for_subxids.v3.patch
Adds new test cases to master without adding any new code, specifically
addressing the two areas of code that are not tested by existing tests.
This gives us a baseline from which we can do test driven development.
I'm hoping this can be reviewed and committed fairly smoothly.
002_minimize_calls_to_SubTransSetParent.v8.patch
Reduces the number of calls to subtrans to below 1% for the first 64
subxids, so it will substantially reduce subtrans contention on master
in the typical case, as well as smoothing the overflow case.
Some discussion needed on this; there are various options.
This combines the work originally posted here with another patch posted on the
thread "Smoothing the subtrans performance catastrophe".
I will do some performance testing as well, but more is welcome.
--
Simon Riggs http://www.EnterpriseDB.com/
On Tue, Aug 30, 2022 at 10:16 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
Thanks for the updated patch, I have some questions/comments.
1.
+ * This has the downside that anyone waiting for a lock on aborted
+ * subtransactions would not be released immediately; that may or
+ * may not be an acceptable compromise. If not acceptable, this
+ * simple call needs to be replaced with a loop to register the
+ * parent for the current subxid stack, so we can walk back up it to
+ * the topxid.
+ */
+ SubTransSetParent(subxid, GetTopTransactionId());
I do not understand in which situation we would see this downside. If
we look at the logic of XactLockTableWait(), then in the current
situation too, once the subtransaction is committed we directly wait
on the top transaction by calling SubTransGetTopmostTransaction(xid).
So if the lock-taking subtransaction is committed then we will wait
directly for the top-level transaction and after that, it doesn't
matter if we abort any of the parent subtransactions, because it will
wait for the topmost transaction to complete. And if the lock-taking
subtransaction is aborted then it will anyway stop waiting because
TransactionIdIsInProgress() should return false.
2.
/*
* Notice that we update pg_subtrans with the top-level xid, rather than
* the parent xid. This is a difference between normal processing and
* recovery, yet is still correct in all cases. The reason is that
* subtransaction commit is not marked in clog until commit processing, so
* all aborted subtransactions have already been clearly marked in clog.
* As a result we are able to refer directly to the top-level
* transaction's state rather than skipping through all the intermediate
* states in the subtransaction tree. This should be the first time we
* have attempted to SubTransSetParent().
*/
for (i = 0; i < nsubxids; i++)
SubTransSetParent(subxids[i], topxid);
I think this comment needs some modification, because with this patch
we are now setting the topxid as the parent in normal processing as
well, right?
3.
+ while (TransactionIdIsValid(parentXid))
+ {
+ previousXid = parentXid;
+
+ /*
+ * Stop as soon as we are earlier than the cutoff. This saves multiple
+ * lookups against subtrans when we have a deeply nested subxid with
+ * a later snapshot with an xmin much higher than TransactionXmin.
+ */
+ if (TransactionIdPrecedes(parentXid, cutoff_xid))
+ {
+ *xid = previousXid;
+ return true;
+ }
+ parentXid = SubTransGetParent(parentXid);
Do we still need this while loop if we are directly setting the topxid
as the parent? With that, we do not need multiple iterations to reach
the top xid.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, 6 Sept 2022 at 12:37, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Aug 30, 2022 at 10:16 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
Thanks for the updated patch, I have some questions/comments.
Thanks for the review.
1.
+ * This has the downside that anyone waiting for a lock on aborted
+ * subtransactions would not be released immediately; that may or
+ * may not be an acceptable compromise. If not acceptable, this
+ * simple call needs to be replaced with a loop to register the
+ * parent for the current subxid stack, so we can walk back up it to
+ * the topxid.
+ */
+ SubTransSetParent(subxid, GetTopTransactionId());
I do not understand in which situation we will see this downside. I
mean if we see the logic of XactLockTableWait() then in the current
situation also if the subtransaction is committed we directly wait on
the top transaction by calling SubTransGetTopmostTransaction(xid);
So if the lock-taking subtransaction is committed then we will wait
directly for the top-level transaction and after that, it doesn't
matter if we abort any of the parent subtransactions, because it will
wait for the topmost transaction to complete. And if the lock-taking
subtransaction is aborted then it will anyway stop waiting because
TransactionIdIsInProgress() should return false.
Yes, correct.
2.
/*
* Notice that we update pg_subtrans with the top-level xid, rather than
* the parent xid. This is a difference between normal processing and
* recovery, yet is still correct in all cases. The reason is that
* subtransaction commit is not marked in clog until commit processing, so
* all aborted subtransactions have already been clearly marked in clog.
* As a result we are able to refer directly to the top-level
* transaction's state rather than skipping through all the intermediate
* states in the subtransaction tree. This should be the first time we
* have attempted to SubTransSetParent().
*/
for (i = 0; i < nsubxids; i++)
SubTransSetParent(subxids[i], topxid);
I think this comment needs some modification because in this patch now
in normal processing also we are setting the topxid as a parent right?
Correct
3.
+ while (TransactionIdIsValid(parentXid))
+ {
+ previousXid = parentXid;
+
+ /*
+ * Stop as soon as we are earlier than the cutoff. This saves multiple
+ * lookups against subtrans when we have a deeply nested subxid with
+ * a later snapshot with an xmin much higher than TransactionXmin.
+ */
+ if (TransactionIdPrecedes(parentXid, cutoff_xid))
+ {
+ *xid = previousXid;
+ return true;
+ }
+ parentXid = SubTransGetParent(parentXid);
Do we need this while loop if we are directly setting topxid as a
parent, so with that, we do not need multiple iterations to go to the
top xid?
Correct. I think we can dispense with
SubTransGetTopmostTransactionPrecedes() entirely.
I was initially trying to leave options open, but that is confusing,
and as a result some parts became misleading after I merged the two
patches.
I will update the patch, thanks for your scrutiny.
--
Simon Riggs http://www.EnterpriseDB.com/
On Tue, 6 Sept 2022 at 13:14, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
I will update the patch, thanks for your scrutiny.
I attach a diff showing what has changed between v8 and v9, and will
reattach a full set of new patches in the next post, so patchtester
doesn't squeal.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
subx.v8--v9.diff (application/octet-stream, +46 -99)
On Tue, 13 Sept 2022 at 11:56, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Full set of v9 patches
--
Simon Riggs http://www.EnterpriseDB.com/
On 2022-Aug-30, Simon Riggs wrote:
001_new_isolation_tests_for_subxids.v3.patch
Adds new test cases to master without adding any new code, specifically
addressing the two areas of code that are not tested by existing tests.
This gives us a baseline from which we can do test driven development.
I'm hoping this can be reviewed and committed fairly smoothly.
I gave this a quick run to confirm the claimed increase of coverage. It
checks out, so pushed.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Thank you.
So now we just have the main part of the patch, reattached here for
the auto patch tester's benefit.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
002_minimize_calls_to_SubTransSetParent.v9.patch (application/octet-stream, +236 -103)
On Thu, 15 Sep 2022 at 18:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Hi Simon,
Thanks for the updated patch, here are some comments.
There is a typo, s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g.
+ * call SubTransGetTopMostTransaction() if that xact overflowed;
Is there a punctuation mark missing on the following first line?
+ * 2. When IsolationIsSerializable() we sometimes need to access topxid
+ * This occurs only when SERIALIZABLE is requested by app user.
When we use a function name in comments, some places use parentheses
but others do not. Why? I think we should keep them consistent,
at least within the same commit.
+ * 3. When TransactionIdSetTreeStatus will use a status of SUB_COMMITTED,
+ * which then requires us to consult subtrans to find parent, which
+ * is needed to avoid race condition. In this case we ask Clog/Xact
+ * module if TransactionIdsAreOnSameXactPage(). Since we start a new
+ * clog page every 32000 xids, this is usually <<1% of subxids.
Maybe we could declare a topxid variable to avoid calling
GetTopTransactionId() twice when we need to set the subtrans parent?
+ TransactionId subxid = XidFromFullTransactionId(s->fullTransactionId);
+ TransactionId topxid = GetTopTransactionId();
...
+ if (MyProc->subxidStatus.overflowed ||
+ IsolationIsSerializable() ||
+ !TransactionIdsAreOnSameXactPage(topxid, subxid))
+ {
...
+ SubTransSetParent(subxid, topxid);
+ }
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Thu, 15 Sept 2022 at 12:36, Japin Li <japinli@hotmail.com> wrote:
Thanks for the updated patch, here are some comments.
Thanks for your comments.
There is a typo, s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g.
+ * call SubTransGetTopMostTransaction() if that xact overflowed;
Is there a punctuation mark missing on the following first line?
+ * 2. When IsolationIsSerializable() we sometimes need to access topxid
+ * This occurs only when SERIALIZABLE is requested by app user.
When we use function name in comments, some places we use parentheses,
but others do not use it. Why? I think, we should keep them consistent,
at least in the same commit.
+ * 3. When TransactionIdSetTreeStatus will use a status of SUB_COMMITTED,
+ * which then requires us to consult subtrans to find parent, which
+ * is needed to avoid race condition. In this case we ask Clog/Xact
+ * module if TransactionIdsAreOnSameXactPage(). Since we start a new
+ * clog page every 32000 xids, this is usually <<1% of subxids.
I've reworded those comments, hoping to address all of your above points.
Maybe we declaration a topxid to avoid calling GetTopTransactionId()
twice when we should set subtrans parent?
+ TransactionId subxid = XidFromFullTransactionId(s->fullTransactionId);
+ TransactionId topxid = GetTopTransactionId();
...
+ if (MyProc->subxidStatus.overflowed ||
+ IsolationIsSerializable() ||
+ !TransactionIdsAreOnSameXactPage(topxid, subxid))
+ {
...
+ SubTransSetParent(subxid, topxid);
+ }
Seems a minor point, but I've done this anyway.
Thanks for the review.
v10 attached
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
002_minimize_calls_to_SubTransSetParent.v10.patch (application/octet-stream, +238 -103)
On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Thanks for the review.
v10 attached
v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
002_minimize_calls_to_SubTransSetParent.v11.patch (application/octet-stream, +238 -103)
Hi,
On Mon, 26 Sept 2022 at 15:57, Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Thanks for the review.
v10 attached
v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.
Please find below the performance tests results I have produced for this patch.
Attaching some charts and the scripts used to reproduce these tests.
1. Assumption
The number of sub-transactions issued by a single long-running
transaction may affect global TPS throughput if the number of
sub-transactions exceeds 64 (sub-overflow)
2. Testing scenario
Based on pgbench, 2 different types of DB activity are applied concurrently:
- 1 long running transaction, including N sub-transactions
- X pgbench clients running read-only workload
Tests are executed with a varying number of sub-transactions: from 0 to 128
Key metric is the TPS rate reported by pgbench runs in read-only mode
Tests are executed against
- HEAD (14a737)
- HEAD (14a737) + 002_minimize_calls_to_SubTransSetParent.v11.patch
3. Long transaction anatomy
Two different long transactions are tested because they don't have the
exact same impact on performance.
Transaction number 1 includes one UPDATE affecting each row of
pgbench_accounts, plus an additional UPDATE affecting only one row,
executed in its own rolled-back sub-transaction:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN - 1;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;
ROLLBACK TO SAVEPOINT sN;
-- sleeping until the end of the test
ROLLBACK;
Transaction 2 includes one UPDATE affecting each row of pgbench_accounts:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
-- sleeping until the end of the test
ROLLBACK;
4. Test results with transaction 1
TPS vs number of sub-transactions
nsubx HEAD patched
--------------------
0 441109 439474
8 439045 438103
16 439123 436993
24 436269 434194
32 439707 437429
40 439997 437220
48 439388 437422
56 439409 437210
64 439748 437366
72 92869 434448
80 66577 434100
88 61243 434255
96 57016 434419
104 52132 434917
112 49181 433755
120 46581 434044
128 44067 434268
Perf profiling on HEAD with 80 sub-transactions:
Overhead Symbol
51.26% [.] LWLockAttemptLock
24.59% [.] LWLockRelease
0.36% [.] base_yyparse
0.35% [.] PinBuffer
0.34% [.] AllocSetAlloc
0.33% [.] hash_search_with_hash_value
0.22% [.] LWLockAcquire
0.20% [.] UnpinBuffer
0.15% [.] SimpleLruReadPage_ReadOnly
0.15% [.] _bt_compare
Perf profiling on patched with 80 sub-transactions:
Overhead Symbol
2.64% [.] AllocSetAlloc
2.09% [.] base_yyparse
1.76% [.] hash_search_with_hash_value
1.62% [.] LWLockAttemptLock
1.26% [.] MemoryContextAllocZeroAligned
0.93% [.] _bt_compare
0.92% [.] expression_tree_walker_impl.part.4
0.84% [.] SearchCatCache1
0.79% [.] palloc
0.64% [.] core_yylex
5. Test results with transaction 2
nsubx HEAD patched
--------------------
0 440145 443816
8 438867 443081
16 438634 441786
24 436406 440187
32 439203 442447
40 439819 443574
48 439314 442941
56 439801 443736
64 439074 441970
72 439833 444132
80 148737 439941
88 413714 443343
96 251098 442021
104 70190 443488
112 405507 438866
120 177827 443202
128 399431 441842
From the performance point of view, this patch clearly fixes the
dramatic TPS collapse shown in these tests.
Regards,
--
Julien Tachoires
EDB
On Fri, Oct 28, 2022 at 10:55 PM Julien Tachoires <julmon@gmail.com> wrote:
Hi,
Le lun. 26 sept. 2022 à 15:57, Simon Riggs
<simon.riggs@enterprisedb.com> a écrit :On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Thanks for the review.
v10 attached
v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.Please find below the performance tests results I have produced for this patch.
Attaching some charts and the scripts used to reproduce these tests.1. Assumption
The number of sub-transaction issued by only one long running
transaction may affect global TPS throughput if the number of
sub-transaction exceeds 64 (sub-overflow)2. Testing scenario
Based on pgbench, 2 different types of DB activity are applied concurrently:
- 1 long running transaction, including N sub-transactions
- X pgbench clients running read-only workloadTests are executed with a varying number of sub-transactions: from 0 to 128
Key metric is the TPS rate reported by pgbench runs in read-only modeTests are executed against
- HEAD (14a737)
- HEAD (14a737) + 002_minimize_calls_to_SubTransSetParent.v11.patch3. Long transaction anatomy
Two different long transactions are tested because they don't have the
exact same impact on performance.Transaction number 1 includes one UPDATE affecting each row of
pgbench_accounts, plus an additional UPDATE affecting only one row but
executed in its own rollbacked sub-transaction:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN - 1;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;
ROLLBACK TO SAVEPOINT sN;
-- sleeping until the end of the test
ROLLBACK;Transaction 2 includes one UPDATE affecting each row of pgbench_accounts:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
-- sleeping until the end of the test
ROLLBACK;4. Test results with transaction 1
TPS vs number of sub-transaction
nsubx HEAD patched
--------------------
0 441109 439474
8 439045 438103
16 439123 436993
24 436269 434194
32 439707 437429
40 439997 437220
48 439388 437422
56 439409 437210
64 439748 437366
72 92869 434448
80 66577 434100
88 61243 434255
96 57016 434419
104 52132 434917
112 49181 433755
120 46581 434044
128 44067 434268Perf profiling on HEAD with 80 sub-transactions:
Overhead Symbol
51.26% [.] LWLockAttemptLock
24.59% [.] LWLockRelease
0.36% [.] base_yyparse
0.35% [.] PinBuffer
0.34% [.] AllocSetAlloc
0.33% [.] hash_search_with_hash_value
0.22% [.] LWLockAcquire
0.20% [.] UnpinBuffer
0.15% [.] SimpleLruReadPage_ReadOnly
0.15% [.] _bt_compare

Perf profiling on patched with 80 sub-transactions:
Overhead Symbol
2.64% [.] AllocSetAlloc
2.09% [.] base_yyparse
1.76% [.] hash_search_with_hash_value
1.62% [.] LWLockAttemptLock
1.26% [.] MemoryContextAllocZeroAligned
0.93% [.] _bt_compare
0.92% [.] expression_tree_walker_impl.part.4
0.84% [.] SearchCatCache1
0.79% [.] palloc
0.64% [.] core_yylex

5. Test results with transaction 2
nsubx HEAD patched
--------------------
0 440145 443816
8 438867 443081
16 438634 441786
24 436406 440187
32 439203 442447
40 439819 443574
48 439314 442941
56 439801 443736
64 439074 441970
72 439833 444132
80 148737 439941
88 413714 443343
96 251098 442021
104 70190 443488
112 405507 438866
120 177827 443202
128 399431 441842

From the performance point of view, this patch clearly fixes the
dramatic TPS collapse shown in these tests.
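As an aside, the N-savepoint scripts above are mechanical to generate. Here is a minimal illustrative sketch (not one of the attached test scripts; the pg_sleep duration and the helper name are assumptions):

```python
# Sketch: generate the SQL for "transaction 1" with nsubx sub-transactions
# (nsubx >= 1), following the anatomy described above: savepoints s1..sN,
# the full-table UPDATE before sN, and a single-row UPDATE under sN that
# is rolled back, leaving the session idle in transaction.
def transaction_1_sql(nsubx: int) -> str:
    lines = ["BEGIN;"]
    # Open savepoints s1 .. s(N-1).
    for i in range(1, nsubx):
        lines.append(f"SAVEPOINT s{i};")
    lines.append("UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;")
    # The last savepoint wraps a single-row UPDATE that is rolled back.
    lines.append(f"SAVEPOINT s{nsubx};")
    lines.append("UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;")
    lines.append(f"ROLLBACK TO SAVEPOINT s{nsubx};")
    # Hold the transaction open for the duration of the pgbench run
    # (600 s is an arbitrary placeholder).
    lines.append("SELECT pg_sleep(600);")
    lines.append("ROLLBACK;")
    return "\n".join(lines)

print(transaction_1_sql(4))
```

Feeding the output to psql while the read-only pgbench clients run reproduces the concurrent workload.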
I think these are really promising results. Although the perf profile
shows that the bottleneck on the SLRU is no longer there with the patch,
I think it would be nice to see the wait events as well.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Hi,
On Tue, 1 Nov 2022 at 09:39, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Oct 28, 2022 at 10:55 PM Julien Tachoires <julmon@gmail.com> wrote:
Hi,
On Mon, 26 Sept 2022 at 15:57, Simon Riggs
<simon.riggs@enterprisedb.com> wrote:

On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
Thanks for the review.
v10 attached
v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.

Please find below the performance test results I have produced for this patch.
Attaching some charts and the scripts used to reproduce these tests.

...
Please find attached samples returned by the following query when
testing transaction 1 with 80 subxacts:
SELECT wait_event_type, wait_event, locktype, mode, database,
       relation, count(*)
FROM pg_stat_activity AS psa
JOIN pg_locks AS pl ON (psa.pid = pl.pid)
GROUP BY 1, 2, 3, 4, 5, 6
ORDER BY 7 DESC;
Regards,
--
Julien Tachoires
EDB
On Tue, 1 Nov 2022 at 08:55, Julien Tachoires <julmon@gmail.com> wrote:
...
These results are compelling, thank you.
Setting this to Ready for Committer.
--
Simon Riggs http://www.EnterpriseDB.com/