Should we cacheline align PGXACT?
Hackers,
Originally this idea was proposed by Andres Freund while experimenting with
lockfree Pin/UnpinBuffer [1].
The patch is attached, as well as results of pgbench -S on a 72-core
machine. As before, it shows a huge benefit in this case.
For sure, we should validate that it doesn't cause performance regressions
in other cases. At least we should test read-write workloads and smaller machines.
Any other ideas?
[1] /messages/by-id/20160411214029.ce3fw6zxim5k6a2r@alap3.anarazel.de
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Aug 19, 2016 at 11:42 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
Hackers,
originally this idea was proposed by Andres Freund while experimenting with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores machine.
As before it shows huge benefit in this case.
For sure, we should validate that it doesn't cause performance regression in
other cases. At least we should test read-write and smaller machines.
Any other ideas?
Maybe test on a Power machine as well.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
originally this idea was proposed by Andres Freund while experimenting with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine. As before it shows huge benefit in this case.
That's one mighty ugly patch. Can't you do it without needing to
introduce the additional layer of struct nesting?
regards, tom lane
On Fri, Aug 19, 2016 at 4:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
originally this idea was proposed by Andres Freund while experimenting
with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine. As before it shows huge benefit in this case.

That's one mighty ugly patch. Can't you do it without needing to
introduce the additional layer of struct nesting?
That's worrying me too.
We could use an anonymous struct, but that seems to be prohibited in C89,
which we stick to.
Another idea that comes to mind is to manually calculate the size of the
padding and insert it directly into the PGXACT struct. But that seems rather
ugly too. However, it would be an ugly definition, not ugly usage...
Do you have better ideas?
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Aug 19, 2016 at 3:19 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Aug 19, 2016 at 11:42 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Hackers,
originally this idea was proposed by Andres Freund while experimenting
with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores machine.
As before it shows huge benefit in this case.
For sure, we should validate that it doesn't cause performance regression in
other cases. At least we should test read-write and smaller machines.
Any other ideas?

may be test on Power m/c as well.
Good idea. I don't have any such machine at hand now. Do you have one?
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
On Fri, Aug 19, 2016 at 4:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's one mighty ugly patch. Can't you do it without needing to
introduce the additional layer of struct nesting?
That's worrying me too.
We could use anonymous struct, but it seems to be prohibited in C89 which
we stick to.
Another idea, which comes to my mind, is to manually calculate size of
padding and insert it directly to PGXACT struct. But that seems rather
ugly too. However, it would be ugly definition not ugly usage...
Do you have better ideas?
No, that was the best one that had occurred to me, too. You could
probably introduce a StaticAssert that sizeof(PGXACT) is a power of 2
as a means of checking that the manual padding calculation hadn't
gotten broken.
regards, tom lane
On Fri, Aug 19, 2016 at 8:24 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Fri, Aug 19, 2016 at 3:19 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Fri, Aug 19, 2016 at 11:42 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Hackers,
originally this idea was proposed by Andres Freund while experimenting
with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine.
As before it shows huge benefit in this case.
For sure, we should validate that it doesn't cause performance
regression in
other cases. At least we should test read-write and smaller machines.
Any other ideas?

may be test on Power m/c as well.
Good idea. I don't have any such machine at hand now. Do you have one?
Yes, I can help in testing this patch during the CF. Feel free to ping me
if I miss or forget to do so.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Aug 19, 2016, at 2:12 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
Hackers,
originally this idea was proposed by Andres Freund while experimenting with lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores machine. As before it shows huge benefit in this case.
For sure, we should validate that it doesn't cause performance regression in other cases. At least we should test read-write and smaller machines.
Any other ideas?
Wow, nice results. My intuition on why PGXACT helped in the first place was that it minimized the number of cache lines that had to be touched to take a snapshot. Padding obviously would somewhat increase that again, so I can't quite understand why it seems to be helping... any idea?
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
Wow, nice results. My intuition on why PGXACT helped in the first place was that it minimized the number of cache lines that had to be touched to take a snapshot. Padding obviously would somewhat increase that again, so I can't quite understand why it seems to be helping... any idea?
That's an interesting point. I wonder whether this whole thing will be
useless or even counterproductive after (if) Heikki's CSN-snapshot patch
gets in. I would certainly not mind reverting the PGXACT/PGPROC
separation if it proves no longer helpful after that.
regards, tom lane
On 2016-08-20 14:33:13 -0400, Robert Haas wrote:
On Aug 19, 2016, at 2:12 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
Hackers,
originally this idea was proposed by Andres Freund while experimenting with lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores machine. As before it shows huge benefit in this case.
For sure, we should validate that it doesn't cause performance regression in other cases. At least we should test read-write and smaller machines.
Any other ideas?

Wow, nice results. My intuition on why PGXACT helped in the first
place was that it minimized the number of cache lines that had to be
touched to take a snapshot. Padding obviously would somewhat increase
that again, so I can't quite understand why it seems to be
helping... any idea?
I don't think it's that surprising: PGXACT->xid is written to in each
transaction, and ->xmin is often written to multiple times per
transaction. That means that if a PGXACT's cacheline is shared between
backends, one write will often first have another CPU flush its store
buffer / L1 / L2 cache. If there are several hops between two cores, that
can mean quite a bit of added latency. I previously played around with
*removing* the optimization of resetting ->xmin when it's not required
anymore - and on a bigger machine it noticeably increased throughput at
higher client counts.
To me it's pretty clear that rounding up PGXACT's size to 16 bytes
(instead of the current 12, with 4 byte alignment) is going to be a win;
the current approach just leads to pointless sharing. Besides, storing
the database oid in there will allow GetOldestXmin() to only use PGXACT,
and could, with a bit more work, allow us to ignore other databases in
GetSnapshotData().
I'm less sure that going up to a full cacheline is always a win.
Andres
On 2016-08-19 09:46:12 -0400, Tom Lane wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
originally this idea was proposed by Andres Freund while experimenting with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine. As before it shows huge benefit in this case.

That's one mighty ugly patch.
My version of it was only intended to nail down some variability on the
pgpro machine, it wasn't intended for submission.
Can't you do it without needing to introduce the additional layer of
struct nesting?
If we required support for anonymous unions, such things would be a lot
easier to do. That aside, the only alternative seems to be hard-coding
padding space - which probably isn't all that un-fragile either.
On 22 August 2016 at 10:40, Andres Freund <andres@anarazel.de> wrote:
On 2016-08-19 09:46:12 -0400, Tom Lane wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
originally this idea was proposed by Andres Freund while experimenting
with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine. As before it shows huge benefit in this case.

That's one mighty ugly patch.

My version of it was only intended to nail down some variability on the
pgpro machine, it wasn't intended for submission.

Can't you do it without needing to introduce the additional layer of
struct nesting?

If we required support for anonymous unions, such things would be a lot
easier to do. That aside, the only alternative seems to be hard-coding
padding space - which probably isn't all that un-fragile either.
Somewhat naïve question from someone with much less clue about low-level
cache behaviour, trying to follow along: given that we determine such
padding at compile time, how do we ensure that the cacheline size we're
targeting is right at runtime? Or is it safe to assume that using 16 bytes
so we don't cross cache line boundaries is always helpful, whether we have
4 PGXACT entries (64 byte line) or some other number per cacheline?
Also, for anyone else following this discussion who wants to understand it
better, take a look at
http://igoro.com/archive/gallery-of-processor-cache-effects/ .
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2016-08-22 11:25:55 +0800, Craig Ringer wrote:
On 22 August 2016 at 10:40, Andres Freund <andres@anarazel.de> wrote:
On 2016-08-19 09:46:12 -0400, Tom Lane wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
originally this idea was proposed by Andres Freund while experimenting
with
lockfree Pin/UnpinBuffer [1].
The patch is attached as well as results of pgbench -S on 72-cores
machine. As before it shows huge benefit in this case.

That's one mighty ugly patch.

My version of it was only intended to nail down some variability on the
pgpro machine, it wasn't intended for submission.

Can't you do it without needing to introduce the additional layer of
struct nesting?

If we required support for anonymous unions, such things would be a lot
easier to do. That aside, the only alternative seems to be hard-coding
padding space - which probably isn't all that un-fragile either.

Somewhat naïve question from someone with much less clue about low level
cache behaviour trying to follow along: given that we determine such
padding at compile time, how do we ensure that the cacheline size we're
targeting is right at runtime?
There are basically only a few common cacheline sizes; pretty much only
64 bytes and 128 bytes are common these days. By always padding to the
larger of those two, we waste a bit of memory, but not actually cache
space on platforms with smaller lines, because the padding is never
accessed.
Or is it safe to assume that using 16 bytes so we don't cross cache
line boundaries is always helpful, whether we have 4 PGXACT entries
(64 byte line) or some other number per cacheline?
That's generally a good thing to do, yes. It's probably not going to
give the full benefits here though.
Regards
Andres
On Sat, Aug 20, 2016 at 9:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Wow, nice results. My intuition on why PGXACT helped in the first place
was that it minimized the number of cache lines that had to be touched to
take a snapshot. Padding obviously would somewhat increase that again, so I
can't quite understand why it seems to be helping... any idea?

That's an interesting point. I wonder whether this whole thing will be
useless or even counterproductive after (if) Heikki's CSN-snapshot patch
gets in. I would certainly not mind reverting the PGXACT/PGPROC
separation if it proves no longer helpful after that.
Assuming we wouldn't realistically have the CSN-snapshot patch committed to
10, I think we should consider the PGXACT cache line alignment patch for 10.
A revision of the PGXACT align patch is attached. Now it doesn't introduce a
new data structure for alignment, but uses manually calculated padding. I
added a static assertion that PGXACT is exactly PG_CACHE_LINE_SIZE, because it
still has plenty of room for new fields before PG_CACHE_LINE_SIZE would be
exceeded.
Read-only pgbench results on a 72-physical-core Intel server are attached.
They are quite similar to the results I posted before, but it's curious that
the performance degradation of master at high concurrency became larger.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
As discussed at the Developer meeting ~ a week ago, I've run a number of
benchmarks on the commit, on small/medium-size x86 machines. I
currently don't have access to a machine as big as the one used by Alexander
(with 72 physical cores), but it seems useful to verify that the patch does
not have a negative impact on smaller machines.
In particular I've ran these tests:
* r/o pgbench
* r/w pgbench
* 90% reads, 10% writes
* pgbench with skewed distribution
* pgbench with skewed distribution and skipping
Each of those was run with a number of clients, depending on the number of
cores available. I've used the usual two boxes I use for all benchmarks,
i.e. a small i5-2500k machine (8GB RAM, 4 cores), and a medium e5-2620v4
box (32GB RAM, 16/32 cores).
Comparing averages of tps, measured on 5 runs (each 5 minutes long), the
difference between master and patched master is usually within 2%, which
is pretty much within noise.
I'm attaching spreadsheets with summary of the results, so that we have
it in the archives. As usual, the scripts and much more detailed results
are available here:
* e5-2620: https://bitbucket.org/tvondra/test-xact-alignment
* i5-2500k: https://bitbucket.org/tvondra/test-xact-alignment-i5
I do plan to run these benchmarks on the Power8 box I have access to, but
that will have to wait for a bit, because it's currently doing something
else.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
I am currently testing this patch on a large machine and will share the
test results in a few days' time.
Please excuse any grammatical errors, as I am using my mobile device. Thanks.
On Feb 11, 2017 04:59, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote:
Hi,
As discussed at the Developer meeting ~ a week ago, I've ran a number of
benchmarks on the commit, on a small/medium-size x86 machines. I currently
don't have access to a machine as big as used by Alexander (with 72
physical cores), but it seems useful to verify the patch does not have
negative impact on smaller machines.

In particular I've ran these tests:
* r/o pgbench
* r/w pgbench
* 90% reads, 10% writes
* pgbench with skewed distribution
* pgbench with skewed distribution and skipping

And each of that with a number of clients, depending on the number of
cores available. I've used the usual two boxes I use for all benchmarks,
i.e. a small i5-2500k machine (8GB RAM, 4 cores), and a medium e5-2620v4
box (32GB RAM, 16/32 cores).

Comparing averages of tps, measured on 5 runs (each 5 minutes long), the
difference between master and patched master is usually within 2%, which is
pretty much within noise.

I'm attaching spreadsheets with summary of the results, so that we have it
in the archives. As usual, the scripts and much more detailed results are
available here:
* e5-2620: https://bitbucket.org/tvondra/test-xact-alignment
* i5-2500k: https://bitbucket.org/tvondra/test-xact-alignment-i5

I do plan to run these results on the Power8 box I have access to, but
that will have to wait for a bit, because it's currently doing something
else.

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 02/11/2017 02:44 AM, Ashutosh Sharma wrote:
Hi,
I am currently testing this patch on a large machine and will share the
test results in few days of time.
FWIW it might be interesting to have comparable results from the same
benchmarks I did. The scripts available in the git repositories should
not be that hard to tweak. Let me know if you're interested and need
help with that.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, Tomas!
On Sat, Feb 11, 2017 at 2:28 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
As discussed at the Developer meeting ~ a week ago, I've ran a number of
benchmarks on the commit, on a small/medium-size x86 machines. I currently
don't have access to a machine as big as used by Alexander (with 72
physical cores), but it seems useful to verify the patch does not have
negative impact on smaller machines.

In particular I've ran these tests:
* r/o pgbench
* r/w pgbench
* 90% reads, 10% writes
* pgbench with skewed distribution
* pgbench with skewed distribution and skipping
Thank you very much for your efforts!
I took a look at these tests. One thing caught my eye: you warm up the
database using a pgbench run. Did you consider using pg_prewarm instead?
SELECT sum(x.x) FROM (SELECT pg_prewarm(oid) AS x FROM pg_class WHERE
relkind IN ('i', 'r') ORDER BY oid) x;
In my experience pg_prewarm both takes less time and leaves less variation
afterwards.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02/11/2017 01:21 PM, Alexander Korotkov wrote:
Hi, Tomas!
On Sat, Feb 11, 2017 at 2:28 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

As discussed at the Developer meeting ~ a week ago, I've ran a
number of benchmarks on the commit, on a small/medium-size x86
machines. I currently don't have access to a machine as big as used
by Alexander (with 72 physical cores), but it seems useful to verify
the patch does not have negative impact on smaller machines.

In particular I've ran these tests:
* r/o pgbench
* r/w pgbench
* 90% reads, 10% writes
* pgbench with skewed distribution
* pgbench with skewed distribution and skipping

Thank you very much for your efforts!
I took a look at these tests. One thing catch my eyes. You warmup
database using pgbench run. Did you consider using pg_prewarm instead?

SELECT sum(x.x) FROM (SELECT pg_prewarm(oid) AS x FROM pg_class WHERE
relkind IN ('i', 'r') ORDER BY oid) x;

In my experience pg_prewarm both takes less time and leaves less
variation afterwards.
I've considered it, but the problem I see in using pg_prewarm for
benchmarking purposes is that it only loads the data into memory, but it
does not modify the tuples (so all tuples have the same xmin/xmax, no
dead tuples, ...), it does not set usage counters on the buffers and
also does not generate any clog records.
I don't think there's a lot of variability in the results I measured. If
you look at (max-min) for each combination of parameters, the delta is
generally within 2% of average, with a very few exceptions, usually
caused by the first run (so perhaps the warmup should be a bit longer).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
FWIW it might be interesting to have comparable results from the same
benchmarks I did. The scripts available in the git repositories should not
be that hard to tweak. Let me know if you're interested and need help with
that.
Sure, I will have a look at those scripts once I am done with the
simple pgbench testing. Thanks, and sorry for the delayed response.