Re: POC: Cache data in GetSnapshotData()

Started by Mithun Cy · 37 messages · pgsql-hackers
#1Mithun Cy
mithun.cy@enterprisedb.com

On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:

I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I tried to implement a simple patch which protects the cache: of all the
backends that compute the snapshot (when the cache is invalid), only one
of them will write to the cache. This is done with one atomic
compare-and-swap operation.
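As a rough illustration of that scheme (this is not the actual patch code — the names and the C11-atomics framing are mine), the write can be claimed with a single compare-and-swap so that at most one backend fills the cache while the others skip the store:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cached-snapshot structure; the real patch works on
 * ProcArray/PGXACT state in shared memory. */
#define CACHED_MAX 64

typedef struct
{
    _Atomic uint32_t state;     /* 0 = invalid, 1 = being written, 2 = valid */
    uint32_t    xmin;
    uint32_t    xmax;
    uint32_t    xcnt;
    uint32_t    xip[CACHED_MAX];
} CachedSnapshot;

static CachedSnapshot cache;

/*
 * Every backend that just computed a snapshot calls this, but the
 * compare-and-swap lets exactly one of them claim the right to fill
 * the cache; losers return immediately and keep their private copy.
 */
static int
try_cache_snapshot(uint32_t xmin, uint32_t xmax,
                   const uint32_t *xip, uint32_t xcnt)
{
    uint32_t    expected = 0;

    if (!atomic_compare_exchange_strong(&cache.state, &expected, 1))
        return 0;               /* someone else is writing, or cache valid */

    cache.xmin = xmin;
    cache.xmax = xmax;
    cache.xcnt = xcnt;
    memcpy(cache.xip, xip, xcnt * sizeof(uint32_t));
    atomic_store(&cache.state, 2);      /* publish the filled cache */
    return 1;
}
```

The single CAS plays the role of the "TryLock" Andres suggested: backends never block on the cache, they only update it opportunistically.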

After the above fix the memory corruption is no longer visible, but I see
some more failures at higher client counts (at 128 clients it is easily
reproducible).

Errors are
DETAIL: Key (bid)=(24) already exists.
STATEMENT: UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid
= $2;
client 17 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(24) already exists.
client 26 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(87) already exists.
ERROR: duplicate key value violates unique constraint
"pgbench_branches_pkey"
DETAIL: Key (bid)=(113) already exists.

After some analysis, I think the problem is in GetSnapshotData(), while
computing the snapshot:

    /*
     * We don't include our own XIDs (if any) in the snapshot, but we
     * must include them in xmin.
     */
    if (NormalTransactionIdPrecedes(xid, xmin))
        xmin = xid;
    if (pgxact == MyPgXact)        /* <-- our own xid is skipped here */
        continue;

We do not add our own xid to the xip array, so I am wondering: if another
backend tries to use the same snapshot, will it be able to see the changes
made by me (the current backend)? Since we compute a snapshot that will be
used by other backends, I think we need to add our xid to the xip array to
mark the transaction as open.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

cache_snapshot_data_avoid_cuncurrent_write_to_cache.patch (text/x-patch, +101/-41)
#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Mithun Cy (#1)

On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:

I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I tried to implement a simple patch which protect the cache. Of all the
backend which
compute the snapshot(when cache is invalid) only one of them will write to
cache.
This is done with one atomic compare and swap operation.

After above fix memory corruption is not visible. But I see some more
failures at higher client sessions(128 it is easily reproducible).

Don't you think we need to update the snapshot fields like count,
subcount before saving it to shared memory?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Mithun Cy (#1)

On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:

I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I tried to implement a simple patch which protect the cache. Of all the
backend which
compute the snapshot(when cache is invalid) only one of them will write to
cache.
This is done with one atomic compare and swap operation.

After above fix memory corruption is not visible. But I see some more
failures at higher client sessions(128 it is easily reproducible).

Errors are
DETAIL: Key (bid)=(24) already exists.
STATEMENT: UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid
= $2;
client 17 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(24) already exists.
client 26 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(87) already exists.
ERROR: duplicate key value violates unique constraint
"pgbench_branches_pkey"
DETAIL: Key (bid)=(113) already exists.

After some analysis I think in GetSnapshotData(), while computing the
snapshot:

    /*
     * We don't include our own XIDs (if any) in the snapshot, but we
     * must include them in xmin.
     */
    if (NormalTransactionIdPrecedes(xid, xmin))
        xmin = xid;
    if (pgxact == MyPgXact)        /* <-- our own xid is skipped here */
        continue;

We do not add our own xid to xip array, I am wondering if other backend
try to use
the same snapshot will it be able to see changes made by me(current
backend).
I think since we compute a snapshot which will be used by other backends
we need to
add our xid to xip array to tell transaction is open.

I also think this observation of yours is right, and currently that is
okay because we always first check TransactionIdIsCurrentTransactionId().
Refer to the comments on top of XidInMVCCSnapshot() [1]. However, I am
not sure it is a good idea to start including the current backend's xid
in the snapshot, because that can lead to including its subxids as well,
which can make the snapshot bigger for cases where the current
transaction has many sub-transactions. One way could be to store the
current backend's transaction and sub-transaction ids only in the saved
snapshot; does that sound reasonable to you?
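To see why the XidInMVCCSnapshot() comment matters here — this is purely a toy sketch with made-up helpers, not PostgreSQL code — the own-xid shortcut only works because the current-transaction check happens before the snapshot is consulted, a guarantee that disappears once another backend reuses a cached snapshot that omits the saver's xid:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for TransactionIdIsCurrentTransactionId(). */
static TransactionId my_xid = 100;

static bool
xid_is_current(TransactionId xid)
{
    return xid == my_xid;
}

/* Toy snapshot: xids in [xmin, xmax) listed in xip[] are in progress. */
typedef struct
{
    TransactionId xmin;
    TransactionId xmax;
    TransactionId xip[8];
    int         xcnt;
} Snap;

static bool
xid_in_snapshot(TransactionId xid, const Snap *s)
{
    if (xid < s->xmin)
        return false;           /* definitely finished before the snapshot */
    if (xid >= s->xmax)
        return true;            /* started after the snapshot was taken */
    for (int i = 0; i < s->xcnt; i++)
        if (s->xip[i] == xid)
            return true;
    return false;
}

/*
 * Visibility-style check: our own xid is intercepted *before* the
 * snapshot is consulted, which is why GetSnapshotData may omit it from
 * xip for a private snapshot. A cached snapshot reused by another
 * backend loses that interception, so the saver's xid must be in xip.
 */
static bool
xid_treated_as_running(TransactionId xid, const Snap *s)
{
    if (xid_is_current(xid))
        return true;            /* own transaction: no snapshot lookup */
    return xid_in_snapshot(xid, s);
}
```

In the test below, a snapshot that omits xid 100 still gives the right answer in the owning backend, but gives the wrong answer to any backend for which 100 is not the current xid — exactly the duplicate-key failure mode reported above.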

Other than that, I think patch needs to clear saved snapshot for
Commit Prepared Transaction as well (refer function
FinishPreparedTransaction()).

! else if (!snapshot->takenDuringRecovery)
  {
      int        *pgprocnos = arrayP->pgprocnos;
      int         numProcs;
+     const uint32 snapshot_cached = 0;

I don't think the const is required for the above variable.

[1] Note: GetSnapshotData never stores either top xid or subxids of our own
backend into a snapshot, so these xids will not be reported as "running"
by this function. This is OK for current uses, because we always check
TransactionIdIsCurrentTransactionId first, except for known-committed
XIDs which could not be ours anyway.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#4Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Amit Kapila (#3)

On Sat, Jan 16, 2016 at 10:23 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>

wrote:

On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:

I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I also think this observation of yours is right and currently that is
okay because we always first check TransactionIdIsCurrentTransactionId().

+ const uint32 snapshot_cached= 0;

I have fixed all of the issues reported by the regression tests. Also,
when a backend now tries to cache the snapshot, we also store its own xid
and subxids, so that other backends can use them.

I also did some read-only perf tests.

Non-Default Settings.
================
scale_factor=300.

./postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9

./pgbench -c $clients -j $clients -T 300 -M prepared -S postgres

Machine Detail:
cpu : POWER8
cores: 24 (192 with HT).

Clients  Base           With cached snapshot
1        19653.914409   19926.884664
16       190764.519336  190040.299297
32       339327.881272  354467.445076
48       462632.02493   464767.917813
64       522642.515148  533271.556703
80       515262.813189  513353.962521

But I did not see any performance improvement. I will continue testing.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

Cache_data_in_GetSnapshotData_POC.patch (text/x-patch, +127/-10)
#5Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#4)

On Thu, Feb 25, 2016 at 12:57 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

I have fixed all of the issues reported by regress test. Also now when
backend try to cache the snapshot we also try to store the self-xid and sub
xid, so other backends can use them.

I also did some read-only perf tests.

I'm not sure what you are doing that keeps breaking threading for
Gmail, but this thread is getting split up for me into multiple
threads with only a few messages in each. The same seems to be
happening in the community archives. Please try to figure out what is
causing that and stop doing it.

I notice you seem not to have implemented this suggestion by Andres:

http://www.postgresql.org//message-id/20160104085845.m5nrypvmmpea5nm7@alap3.anarazel.de

Also, you should test this on a machine with more than 2 cores.
Andres's original test seems to have been on a 4-core system, where
this would be more likely to work out.

Also, Andres suggested testing this on an 80-20 write mix, whereas you
tested it on 100% read-only. In the read-only case there is no blocking
on ProcArrayLock, which reduces the chances that the patch will benefit.

I suspect, too, that the chances of this patch working out have
probably been reduced by 0e141c0fbb211bdd23783afa731e3eef95c9ad7a,
which seems to be targeting mostly the same problem. I mean it's
possible that this patch could win even when no ProcArrayLock-related
blocking is happening, but the original idea seems to have been that
it would help mostly with that case. You could try benchmarking it on
the commit just before that one and then on current sources and see if
you get the same results on both, or if there was more benefit before
that commit.

Also, you could consider repeating the LWLOCK_STATS testing that Amit
did in his original reply to Andres. That would tell you whether the
performance is not increasing because the patch doesn't reduce
ProcArrayLock contention, or whether the performance is not increasing
because the patch DOES reduce ProcArrayLock contention but then the
contention shifts to CLogControlLock or elsewhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#5)

On Fri, Feb 26, 2016 at 2:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 25, 2016 at 12:57 PM, Mithun Cy <mithun.cy@enterprisedb.com>

wrote:

I have fixed all of the issues reported by regress test. Also now when
backend try to cache the snapshot we also try to store the self-xid and

sub

xid, so other backends can use them.

+ /* Add my own XID to snapshot. */
+ csnap->xip[count] = MyPgXact->xid;
+ csnap->xcnt = count + 1;

Don't we need to add this only when the xid of the current transaction is
valid? Also, I think it would be better if we can explain why we need to
add our own transaction id while caching the snapshot.
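A minimal sketch of the guard being suggested (the helper below is hypothetical, modeled on the real TransactionIdIsValid macro; a read-only backend may have no xid assigned at all):

```c
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define TransactionIdIsValid(xid) ((xid) != InvalidTransactionId)

/*
 * Hypothetical helper for filling the cached snapshot: append our own
 * xid to the xip array only when one has actually been assigned.
 * Appending InvalidTransactionId would corrupt the shared snapshot.
 */
static uint32_t
append_own_xid(TransactionId *xip, uint32_t count, TransactionId my_xid)
{
    if (TransactionIdIsValid(my_xid))
        xip[count++] = my_xid;
    return count;
}
```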

I also did some read-only perf tests.

I'm not sure what you are doing that keeps breaking threading for
Gmail, but this thread is getting split up for me into multiple
threads with only a few messages in each. The same seems to be
happening in the community archives. Please try to figure out what is
causing that and stop doing it.

I notice you seem to not to have implemented this suggestion by Andres:

http://www.postgresql.org//message-id/20160104085845.m5nrypvmmpea5nm7@alap3.anarazel.de

As far as I can understand by reading the patch, it already implements
Andres's first suggestion, which is to use a try-lock; the patch now uses
atomic ops instead of an lwlock to implement it, but I think it should
have the same effect. Do you see any problem with that?

Now, talking about the second suggestion, which is to use some form of
'snapshot slots' to reduce the contention further: it seems that could be
beneficial if we see any gain with the try-lock. If you think that can
improve on the current approach taken in the patch, then we can discuss
how to implement it.
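To make the 'snapshot slots' idea concrete — this is only a speculative sketch of mine using C11 atomics and a per-slot sequence counter, not anything from the patch or from Andres's mail — a small ring of slots lets a writer publish a new snapshot without invalidating the copy readers are using:

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUM_SLOTS 4

/* Hypothetical slot: seq is even when stable, odd while being written. */
typedef struct
{
    _Atomic uint32_t seq;
    uint32_t    xmin;
    uint32_t    xmax;
} SnapshotSlot;

static SnapshotSlot slots[NUM_SLOTS];
static _Atomic uint32_t latest;         /* index of last published slot */

static void
publish_snapshot(uint32_t xmin, uint32_t xmax)
{
    uint32_t    next = (atomic_load(&latest) + 1) % NUM_SLOTS;
    SnapshotSlot *s = &slots[next];

    atomic_fetch_add(&s->seq, 1);       /* mark slot busy (seq becomes odd) */
    s->xmin = xmin;
    s->xmax = xmax;
    atomic_fetch_add(&s->seq, 1);       /* mark stable (seq becomes even) */
    atomic_store(&latest, next);        /* point readers at the new slot */
}

/* Returns 1 on a consistent read; 0 means fall back to computing afresh. */
static int
read_snapshot(uint32_t *xmin, uint32_t *xmax)
{
    SnapshotSlot *s = &slots[atomic_load(&latest)];
    uint32_t    seq = atomic_load(&s->seq);

    if (seq & 1)
        return 0;                       /* writer active in this slot */
    *xmin = s->xmin;
    *xmax = s->xmax;
    return atomic_load(&s->seq) == seq; /* reread seq to detect a race */
}
```

Because writers rotate to a fresh slot, readers of the previous slot are never torn, which is the contention-spreading property the 'slots' suggestion is after.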

Also, you should test this on a machine with more than 2 cores.
Andres's original test seems to have been on a 4-core system, where
this would be more likely to work out.

Also, Andres suggested testing this on an 80-20 write mix, where as
you tested it on 100% read-only. In that case there is no blocking
ProcArrayLock, which reduces the chances that the patch will benefit.

+1

Also, you could consider repeating the LWLOCK_STATS testing that Amit
did in his original reply to Andres. That would tell you whether the
performance is not increasing because the patch doesn't reduce
ProcArrayLock contention, or whether the performance is not increasing
because the patch DOES reduce ProcArrayLock contention but then the
contention shifts to CLogControlLock or elsewhere.

+1

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#7Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Amit Kapila (#6)

On Tue, Mar 1, 2016 at 12:59 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

Don't we need to add this only when the xid of current transaction is
valid? Also, I think it will be better if we can explain why we need to
add our own transaction id while caching the snapshot.

I have fixed both of these, and the patch is attached.

Some more tests done after that

pgbench write tests: on an 8-socket, 64-core machine.

./postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9

./pgbench -c $clients -j $clients -T 1800 -M prepared postgres

[image: Inline image 3]

A small improvement in performance at 64 threads.

*LWLock_Stats data:*

ProcArrayLock: Base.

=================

postgresql-2016-03-01_115252.log:PID 110019 lwlock main 4: shacq 1867601
exacq 35625 blk 134682 spindelay 128 dequeue self 28871

postgresql-2016-03-01_115253.log:PID 110115 lwlock main 4: shacq 2201613
exacq 43489 blk 155499 spindelay 127 dequeue self 32751

postgresql-2016-03-01_115253.log:PID 110122 lwlock main 4: shacq 2231327
exacq 44824 blk 159440 spindelay 128 dequeue self 33336

postgresql-2016-03-01_115254.log:PID 110126 lwlock main 4: shacq 2247983
exacq 44632 blk 158669 spindelay 131 dequeue self 33365

postgresql-2016-03-01_115254.log:PID 110059 lwlock main 4: shacq 2036809
exacq 38607 blk 143538 spindelay 117 dequeue self 31008

ProcArrayLock: With Patch.

=====================

postgresql-2016-03-01_124747.log:PID 1789 lwlock main 4: shacq 2273958
exacq 61605 blk 79581 spindelay 307 dequeue self 66088

postgresql-2016-03-01_124748.log:PID 1880 lwlock main 4: shacq 2456388
exacq 65996 blk 82300 spindelay 470 dequeue self 68770

postgresql-2016-03-01_124748.log:PID 1765 lwlock main 4: shacq 2244083
exacq 60835 blk 79042 spindelay 336 dequeue self 65212

postgresql-2016-03-01_124749.log:PID 1882 lwlock main 4: shacq 2489271
exacq 67043 blk 85650 spindelay 463 dequeue self 68401

postgresql-2016-03-01_124749.log:PID 1753 lwlock main 4: shacq 2232791
exacq 60647 blk 78659 spindelay 364 dequeue self 65180

postgresql-2016-03-01_124750.log:PID 1849 lwlock main 4: shacq 2374922
exacq 64075 blk 81860 spindelay 339 dequeue self 67584

------------- Block time of ProcArrayLock has reduced significantly.

ClogControlLock : Base.

===================

postgresql-2016-03-01_115302.log:PID 110040 lwlock main 11: shacq 586129
exacq 268808 blk 90570 spindelay 261 dequeue self 59619

postgresql-2016-03-01_115303.log:PID 110047 lwlock main 11: shacq 593492
exacq 271019 blk 89547 spindelay 268 dequeue self 59999

postgresql-2016-03-01_115303.log:PID 110078 lwlock main 11: shacq 620830
exacq 285244 blk 92939 spindelay 262 dequeue self 61912

postgresql-2016-03-01_115304.log:PID 110083 lwlock main 11: shacq 633101
exacq 289983 blk 93485 spindelay 262 dequeue self 62394

postgresql-2016-03-01_115305.log:PID 110105 lwlock main 11: shacq 646584
exacq 297598 blk 93331 spindelay 312 dequeue self 63279

ClogControlLock : With Patch.

=======================

postgresql-2016-03-01_124730.log:PID 1865 lwlock main 11: shacq 722881
exacq 330273 blk 106163 spindelay 468 dequeue self 80316

postgresql-2016-03-01_124731.log:PID 1857 lwlock main 11: shacq 713720
exacq 327158 blk 106719 spindelay 439 dequeue self 79996

postgresql-2016-03-01_124732.log:PID 1826 lwlock main 11: shacq 696762
exacq 317239 blk 104523 spindelay 448 dequeue self 79374

postgresql-2016-03-01_124732.log:PID 1862 lwlock main 11: shacq 721272
exacq 330350 blk 105965 spindelay 492 dequeue self 81036

postgresql-2016-03-01_124733.log:PID 1879 lwlock main 11: shacq 737398
exacq 335357 blk 105424 spindelay 520 dequeue self 80977

------------- Block time of ClogControlLock has increased slightly.

Will continue with further tests on lower clients.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

image.png (image/png)
Cache_data_in_GetSnapshotData_POC_01.patch (application/octet-stream, +144/-10)
#8Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#7)

On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Tue, Mar 1, 2016 at 12:59 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

Don't we need to add this only when the xid of current transaction is
valid? Also, I think it will be better if we can explain why we need to
add our own transaction id while caching the snapshot.

I have fixed the same thing and patch is attached.

Some more tests done after that

*pgbench write tests: on 8 socket, 64 core machine.*

/postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9

./pgbench -c $clients -j $clients -T 1800 -M prepared postgres

[image: Inline image 3]

A small improvement in performance at 64 thread.

What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.

Or maybe not, but seems worth checking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

image.png (image/png)
#9Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Robert Haas (#8)

On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:

What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.

I did the above tests along with Amit's clog patch. Machine: 8 sockets,
64 cores, 128 hyperthreads.

clients  BASE          ONLY CLOG     % incr  ONLY SAVE     % incr  CLOG + SAVE   % incr
64       29247.658034  30855.728835  5.50    29254.532186  0.02    32691.832776  11.78
88       31214.305391  33313.393095  6.72    32109.248609  2.87    35433.655203  13.52
128      30896.673949  34015.362152  10.09   ***           ***     34947.296355  13.11
256      27183.780921  31192.895437  14.75   ***           ***     32873.782735  20.93

("ONLY CLOG" = only the clog changes; "ONLY SAVE" = only the save-snapshot
patch; "CLOG + SAVE" = both patches applied.)

With the clog changes, the performance of the snapshot-caching patch
improves.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Mithun Cy (#9)

On Thu, Mar 10, 2016 at 1:04 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com>
wrote:

What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.

I did the above tests along with Amit's clog patch. Machine :8 socket, 64
core. 128 hyperthread.

clients  BASE          ONLY CLOG     % incr  ONLY SAVE     % incr  CLOG + SAVE   % incr
64       29247.658034  30855.728835  5.50    29254.532186  0.02    32691.832776  11.78
88       31214.305391  33313.393095  6.72    32109.248609  2.87    35433.655203  13.52
128      30896.673949  34015.362152  10.09   ***           ***     34947.296355  13.11
256      27183.780921  31192.895437  14.75   ***           ***     32873.782735  20.93

With clog changes, perf of caching the snapshot patch improves.

This data looks promising to me and indicates that saving the snapshot has
benefits, and we can see noticeable performance improvement especially once
the CLog contention gets reduced. I wonder if we should try these tests
with unlogged tables, because I suspect most of the contention after
CLogControlLock and ProcArrayLock is for WAL-related locks, so you might be
able to see better gain with these patches. Do you think it makes sense to
check the performance by increasing CLOG buffers (a patch for the same is
posted in the Speed up Clog thread [1]) as that also relieves contention on
CLOG as per the tests I have done?

[1] /messages/by-id/CAA4eK1LMMGNQ439BUm0LcS3p0sb8S9kc-cUGU_ThNqMwA8_Tug@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#11Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Amit Kapila (#10)

Thanks Amit,
I did a quick pgbench write test on unlogged tables at 88 clients, as that
had the peak performance in the previous test. There is a big jump in TPS
due to the clog changes.

clients  BASE          ONLY CLOG     % incr  ONLY SAVE     % incr  CLOG + SAVE   % incr
88       36055.425005  52796.618434  46.43   37728.004118  4.64    56025.454917  55.39

clients  WITH INCREASE IN CLOG BUFFERS  % incr over BASE
88       58217.924138                   61.47

I will continue to benchmark the above tests with a much wider range of
clients.

On Thu, Mar 10, 2016 at 1:36 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, Mar 10, 2016 at 1:04 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com>
wrote:

What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.

I did the above tests along with Amit's clog patch. Machine :8 socket, 64
core. 128 hyperthread.

clients  BASE          ONLY CLOG     % incr  ONLY SAVE     % incr  CLOG + SAVE   % incr
64       29247.658034  30855.728835  5.50    29254.532186  0.02    32691.832776  11.78
88       31214.305391  33313.393095  6.72    32109.248609  2.87    35433.655203  13.52
128      30896.673949  34015.362152  10.09   ***           ***     34947.296355  13.11
256      27183.780921  31192.895437  14.75   ***           ***     32873.782735  20.93

With clog changes, perf of caching the snapshot patch improves.

This data looks promising to me and indicates that saving the snapshot has
benefits and we can see noticeable performance improvement especially once
the CLog contention gets reduced. I wonder if we should try these tests
with unlogged tables, because I suspect most of the contention after
CLogControlLock and ProcArrayLock is for WAL related locks, so you might be
able to see better gain with these patches. Do you think it makes sense to
check the performance by increasing CLOG buffers (patch for same is posted
in Speed up Clog thread [1]) as that also relieves contention on CLOG as
per the tests I have done?

[1] -
/messages/by-id/CAA4eK1LMMGNQ439BUm0LcS3p0sb8S9kc-cUGU_ThNqMwA8_Tug@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#12Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Mithun Cy (#11)

On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

I will continue to benchmark above tests with much wider range of clients.

The latest benchmarks show the following results for unlogged tables:

clients  BASE          ONLY CLOG     % incr  CLOG + SAVE   % incr
1        1198.326337   1328.069656   10.83   1234.078342   2.98
32       37455.181727  38295.250519  2.24    41023.126293  9.53
64       48838.016451  50675.845885  3.76    51662.814319  5.78
88       36878.187766  53173.577363  44.19   56025.454917  51.92
128      35901.537773  52026.024098  44.91   53864.486733  50.03
256      28130.354402  46793.134156  66.34   46817.04602   66.43

("ONLY CLOG" = clog changes alone; "CLOG + SAVE" = clog changes plus the
save-snapshot patch.)

The performance difference at 1 client seems to be just run-to-run
variance.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#13Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#12)

On Wed, Mar 16, 2016 at 4:40 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

I will continue to benchmark above tests with much wider range of

clients.

Latest Benchmarking shows following results for unlogged tables.

clients  BASE          ONLY CLOG     % incr  CLOG + SAVE   % incr
1        1198.326337   1328.069656   10.83   1234.078342   2.98
32       37455.181727  38295.250519  2.24    41023.126293  9.53
64       48838.016451  50675.845885  3.76    51662.814319  5.78
88       36878.187766  53173.577363  44.19   56025.454917  51.92
128      35901.537773  52026.024098  44.91   53864.486733  50.03
256      28130.354402  46793.134156  66.34   46817.04602   66.43

Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#13)

On 2016-03-16 13:29:22 -0400, Robert Haas wrote:

Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.

What's the specifics of the machine tested? I wonder if it either
correlates with the number of hardware threads, real cores, or cache
sizes.

- Andres


#15Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#14)

On Wed, Mar 16, 2016 at 1:33 PM, Andres Freund <andres@anarazel.de> wrote:

On 2016-03-16 13:29:22 -0400, Robert Haas wrote:

Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.

What's the specifics of the machine tested? I wonder if it either
correlates with the number of hardware threads, real cores, or cache
sizes.

I think this was done on cthulhu, whose /proc/cpuinfo output ends this way:

processor : 127
vendor_id : GenuineIntel
cpu family : 6
model : 47
model name : Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
stepping : 2
microcode : 0x37
cpu MHz : 2129.000
cache size : 24576 KB
physical id : 3
siblings : 16
core id : 25
cpu cores : 8
apicid : 243
initial apicid : 243
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64
monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic popcnt aes lahf_lm ida arat epb dtherm tpr_shadow vnmi
flexpriority ept vpid
bogomips : 4266.62
clflush size : 64
cache_alignment : 64
address sizes : 44 bits physical, 48 bits virtual
power management:

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#13)

On Wed, Mar 16, 2016 at 10:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 16, 2016 at 4:40 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:

I will continue to benchmark above tests with much wider range of

clients.

Latest Benchmarking shows following results for unlogged tables.

clients  BASE          ONLY CLOG     % incr  CLOG + SAVE   % incr
1        1198.326337   1328.069656   10.83   1234.078342   2.98
32       37455.181727  38295.250519  2.24    41023.126293  9.53
64       48838.016451  50675.845885  3.76    51662.814319  5.78
88       36878.187766  53173.577363  44.19   56025.454917  51.92
128      35901.537773  52026.024098  44.91   53864.486733  50.03
256      28130.354402  46793.134156  66.34   46817.04602   66.43

Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.

If you see, for the Base readings, there is a performance increase up till
64 clients and then a fall at 88 clients, which to me indicates that it
hits very high contention around CLogControlLock at 88 clients, which the
CLog patch is able to control to a great degree (if required, I think the
same can be verified by the LWLock stats data). One reason for hitting
contention at 88 clients is that this machine seems to have 64 cores (8
sockets with 8 cores per socket), as per the lscpu output below.

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 8
NUMA node(s): 8
Vendor ID: GenuineIntel
CPU family: 6
Model: 47
Model name: Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
Stepping: 2
CPU MHz: 1064.000
BogoMIPS: 4266.62
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 24576K
NUMA node0 CPU(s): 0,65-71,96-103
NUMA node1 CPU(s): 72-79,104-111
NUMA node2 CPU(s): 80-87,112-119
NUMA node3 CPU(s): 88-95,120-127
NUMA node4 CPU(s): 1-8,33-40
NUMA node5 CPU(s): 9-16,41-48
NUMA node6 CPU(s): 17-24,49-56
NUMA node7 CPU(s): 25-32,57-64

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Amit Kapila (#16)

On Thu, Mar 17, 2016 at 9:00 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

If you see, for the Base readings, there is a performance increase up till
64 clients and then there is a fall at 88 clients, which to me indicates
that it hits very high contention around CLogControlLock at 88 clients,
which the CLog patch is able to control to a great degree (if required, I
think the same can be verified by LWLock stats data). One reason for
hitting contention at 88 clients is that this machine seems to have 64
cores (it has 8 sockets and 8 cores per socket) as per the lscpu output.

I am attaching the LWLock stats data for the above test setups (unlogged
tables).

                               At 64 clients       At 88 clients
                               (block-time unit)   (block-time unit)
*BASE*
ProcArrayLock                  182946              117827
ClogControlLock                107420              120266

*clog patch*
ProcArrayLock                  183663              121215
ClogControlLock                 72806               65220

*clog patch + save snapshot*
ProcArrayLock                  128260               83356
ClogControlLock                 78921               74011

This is for unlogged tables. I mainly see that ProcArrayLock has higher
contention at 64 clients, and at 88 clients the contention shifts
slightly to other entities.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

lwlock details.odt (application/vnd.oasis.opendocument.text)
#18Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Mithun Cy (#17)

On Thu, Mar 10, 2016 at 1:36 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

Do you think it makes sense to check the performance by increasing CLOG
buffers (patch for same is posted in Speed up Clog thread [1]) as that
also relieves contention on CLOG as per the tests I have done?

Along with the clog patches and save snapshot, I also applied Amit's
increase-clog-buffers patch. Below are the benchmark results.

clients   BASE            WITH INCREASE IN CLOG BUFFER Patch   % Increase
1          1198.326337     1275.461955                           6.4369458985
32        37455.181727    41239.13593                           10.102618726
64        48838.016451    56362.163626                          15.4063324471
88        36878.187766    58217.924138                          57.8654691695
128       35901.537773    58239.783982                          62.22086182
256       28130.354402    56417.133939                         100.5560723934

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

#19Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#18)

On Tue, Mar 22, 2016 at 4:42 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:

On Thu, Mar 10, 2016 at 1:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Do you think it makes sense to check the performance by increasing CLOG buffers (patch for same is posted in Speed up Clog thread [1]) as that also relieves contention on CLOG as per the tests I have done?

Along with clog patches and save snapshot, I also applied Amit's increase clog buffer patch. Below is the benchmark results.

I think that we really shouldn't do anything about this patch until
after the CLOG stuff is settled, which it isn't yet. So I'm going to
mark this Returned with Feedback; let's reconsider it for 9.7.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Robert Haas (#19)

On Fri, Apr 8, 2016 at 12:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think that we really shouldn't do anything about this patch until
after the CLOG stuff is settled, which it isn't yet. So I'm going to
mark this Returned with Feedback; let's reconsider it for 9.7.

I am attaching a rebased patch. I have benchmarked again and could see a
good improvement in the pgbench read-only case at very high client counts
on our cthulhu (8 nodes, 128 hyperthreads) and power2 (4 nodes, 192
hyperthreads) machines. There is some inconsistency in the base-code
benchmark results; once I figure out the cause of that, I will post an
update.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachments:

Cache_data_in_GetSnapshotData_POC_02.patch (application/octet-stream, +140 −10)
#21Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Mithun Cy (#20)
#22Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Mithun Cy (#21)
#23Andres Freund
andres@anarazel.de
In reply to: Mithun Cy (#22)
#24Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Andres Freund (#23)
#25Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Andres Freund (#23)
#26Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Mithun Cy (#25)
#27Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Andres Freund (#23)
#28Jesper Pedersen
jesper.pedersen@redhat.com
In reply to: Mithun Cy (#27)
#29Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Jesper Pedersen (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#25)
#31Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#30)
#32Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Robert Haas (#30)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#32)
#34Mithun Cy
mithun.cy@enterprisedb.com
In reply to: Robert Haas (#33)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Mithun Cy (#34)
#36Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#35)
#37Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#36)