Performance degradation in commit 6150a1b0
For the past few weeks, we have been seeing performance degradation in
read-only benchmarks on high-end machines. My colleague Mithun tried
reverting commit ac1d794, which had previously been reported [1] to degrade
performance in HEAD on high-end machines, but we still saw the degradation.
We then did some profiling to see what caused it, and found that it is
mainly caused by spinlock contention via pin/unpin buffer. Next we tried
reverting commit 6150a1b0, which recently changed the structures in that
area, and it turns out that with that patch reverted we see no degradation
in performance. The important point to note is that the degradation does not
occur on every run, but if the tests are repeated two or three times it is
easily visible.
Machine details:
IBM POWER-8
24 cores, 192 hardware threads
RAM: 492GB
Non-default postgresql.conf settings:
shared_buffers=16GB
max_connections=200
min_wal_size=15GB
max_wal_size=20GB
checkpoint_timeout=900
maintenance_work_mem=1GB
checkpoint_completion_target=0.9
scale_factor - 300
Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002 TPS
at 64 clients, and at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 it went down
to 200807 TPS. These numbers are medians of three 15-minute pgbench
read-only tests. Similar data is seen even when we revert the patch on the
latest commit. We have yet to perform a detailed analysis of why commit
6150a1b08a9fe7ead2b25240be46dddeae9d98e1 leads to the degradation, but any
ideas are welcome.
[1]: /messages/by-id/CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 24 February 2016 at 23:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
> [...] we tried reverting commit 6150a1b0, which recently changed the
> structures in that area, and it turns out that with that patch reverted we
> see no degradation in performance.
I have not seen that on the original patch I posted. 6150a1b0 contains
multiple changes to the lwlock structures, one written by me, others by
Andres. Perhaps we should revert that patch and re-apply the various changes
as multiple commits so we can see the differences.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 25, 2016 at 11:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Perhaps we should revert that patch and re-apply the various changes as
> multiple commits so we can see the differences.
Yes, that's one choice; the other is that we narrow down the root cause of
the problem locally and then try to address it. Last time a similar issue
came up on the list, the agreement [1] was to note it in the PostgreSQL 9.6
open items and then work on it. For this problem we haven't yet gotten to
the root cause, so we can try to investigate it. If nobody else steps up to
reproduce and look into the problem, I will look into it in a few days.
[1]: /messages/by-id/CA+TgmoYjYqegXzrBizL-Ov7zDsS=GavCnxYnGn9WZ1S=rP8DaA@mail.gmail.com
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 25 February 2016 at 18:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yes, that's one choice; the other is that we narrow down the root cause of
> the problem locally and then try to address it.
I don't understand this. If a problem is caused by one of two things, first
you check one, then the other.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Feb 26, 2016 at 8:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I don't understand this. If a problem is caused by one of two things,
> first you check one, then the other.
I don't quite understand how you think that patch can be decomposed
into multiple, independent changes. It was one commit because every
change in there is interdependent with every other one, at least as
far as I can see. I don't really understand how you'd split it up, or
what useful information you'd hope to gain from testing a split patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2016-02-25 12:56:39 +0530, Amit Kapila wrote:
> The important point to note is that the degradation does not occur on
> every run, but if the tests are repeated two or three times it is easily
> visible.
> [...]
> Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002
> TPS at 64 clients, and at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 it went
> down to 200807 TPS.
Ugh. The varying performance is especially odd. Does it vary between
restarts, or is it just happenstance? If it's the former, we might be
dealing with some alignment issues.
If not, I wonder if the issue is massive buffer header contention. On an
LL/SC architecture, acquiring the content lock might interrupt buffer
spinlock acquisition and vice versa.
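A minimal sketch of the alignment concern, using assumed sizes rather than
the real PostgreSQL definitions: once a densely packed descriptor grows past
64 bytes, some array elements straddle 128-byte lines, so touching those
buffers hits two cache lines.

#include <stdio.h>

#define CACHE_LINE_SIZE 128     /* assumed line size on this POWER box */
#define DESC_SIZE 80            /* assumed descriptor size after the change */

int
main(void)
{
    int     i;

    for (i = 0; i < 8; i++)
    {
        size_t  start = (size_t) i * DESC_SIZE;
        size_t  end = start + DESC_SIZE - 1;

        if (start / CACHE_LINE_SIZE != end / CACHE_LINE_SIZE)
            printf("descriptor %d straddles a cache line (bytes %zu..%zu)\n",
                   i, start, end);
    }
    return 0;
}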
Does applying the patch from http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
change the picture?
Regards,
Andres
On Sat, Feb 27, 2016 at 12:41 AM, Andres Freund <andres@anarazel.de> wrote:
> Ugh. The varying performance is especially odd. Does it vary between
> restarts, or is it just happenstance? If it's the former, we might be
> dealing with some alignment issues.
It varies between restarts.
> If not, I wonder if the issue is massive buffer header contention. On an
> LL/SC architecture, acquiring the content lock might interrupt buffer
> spinlock acquisition and vice versa. Does applying the patch from
> http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
> change the picture?
Not tried yet, but if this is an alignment issue as you suspect above, does
it make sense to try that out?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On February 26, 2016 7:55:18 PM PST, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Not tried yet, but if this is an alignment issue as you suspect above,
> does it make sense to try that out?
It's the other theory I had. And it's additionally useful testing regardless of this regression...
---
Please excuse brevity and formatting - I am writing this on my mobile phone.
Hi All,
I have been working on this issue for the last few days, trying to
investigate the probable reasons for the performance degradation at commit
6150a1b0. After going through Andres's patch for moving the buffer I/O and
content locks out of the main tranche, the following two things come to mind
(see the sketch below).
1. The content lock is no longer used as a pointer in the BufferDesc
structure; instead it is included as an LWLock structure. This increases the
overall structure size from 64 bytes to 80 bytes. To investigate this, I
reverted the content-lock changes from commit 6150a1b0 and took at least 10
readings, and with this change the overall performance is similar to what
was observed earlier, i.e. before commit 6150a1b0.
2. Secondly, the BufferDesc structure is padded to 64 bytes, whereas the PG
cache line alignment is 128 bytes. After changing the BufferDesc padding
size to 128 bytes, along with the change mentioned in point #1 above, the
overall performance is again similar to what was observed before commit
6150a1b0.
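A simplified sketch of the layout change (the field names and sizes here are
illustrative stand-ins, not the exact PostgreSQL definitions): embedding the
LWLock instead of pointing to it is what pushes the descriptor past 64
bytes.

typedef struct LWLockSketch
{
    char        fields[24];         /* stand-in for tranche, state, waiters */
} LWLockSketch;

typedef struct BufferDescBefore
{
    char        tag_and_counts[48]; /* tag, flags, refcount, and so on */
    LWLockSketch *content_lock;     /* 8-byte pointer: 56 bytes, fits in 64 */
} BufferDescBefore;

typedef struct BufferDescAfter
{
    char        tag_and_counts[48];
    LWLockSketch content_lock;      /* embedded: 72 bytes, spills past 64 */
} BufferDescAfter;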
Please have a look at the attached test report, which contains the
performance test results for all the scenarios discussed above, and let me
know your thoughts.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 23, 2016 at 1:59 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> 1. The content lock is no longer used as a pointer in the BufferDesc
> structure; instead it is included as an LWLock structure. [...] I reverted
> the content-lock changes from commit 6150a1b0 [...] the overall
> performance is similar to what was observed before commit 6150a1b0.
> 2. [...] After changing the BufferDesc padding size to 128 bytes, along
> with the change mentioned in point #1 above, the overall performance is
> again similar to what was observed before commit 6150a1b0.
So this indicates that changing the content lock back to an LWLock* in
BufferDesc brings back the performance, which suggests that growing
BufferDesc beyond 64 bytes on this platform caused the regression. I think
it is worth trying the patch [1] suggested by Andres, as it will reduce the
size of BufferDesc and may bring back the performance. Can you try that?
[1]: /messages/by-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2016-03-25 09:29:34 +0530, Amit Kapila wrote:
> 2. Secondly, the BufferDesc structure is padded to 64 bytes, whereas the
> PG cache line alignment is 128 bytes. After changing the BufferDesc
> padding size to 128 bytes, along with the change mentioned in point #1
> above, the overall performance is again similar to what was observed
> before commit 6150a1b0.
That makes sense, as it restores alignment.
> I think it is worth trying the patch [1] suggested by Andres, as it will
> reduce the size of BufferDesc and may bring back the performance. Can you
> try that?
> [1] /messages/by-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com
Yes please. I'll try to review that once more ASAP.
Regards,
Andres
Hi,
I am getting some reject files while trying to apply pinunpin-cas-5.patch,
attached to the thread at
/messages/by-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com
Note: I am applying this patch on top of commit
6150a1b08a9fe7ead2b25240be46dddeae9d98e1.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Hi,
As mentioned in my earlier mail, I was not able to apply pinunpin-cas-5.patch
on commit 6150a1b0, so I applied it on the latest commit instead, which
succeeded. I have now taken performance readings at the latest commit, i.e.
76281aa9, with and without pinunpin-cas-5.patch, and my observations are as
follows:
1. With pinunpin-cas-5.patch applied on commit 76281aa9, the current
performance still lags 2-3% behind the expected performance.
2. Without pinunpin-cas-5.patch, performance measured at commit 76281aa9
lags 50-60% behind the expected performance.
Note: here, the expected performance is the performance observed before
commit 6150a1b0 with ac1d794 reverted.
Please refer to the attached performance report sheet for more insight.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Hi,
On 2016-03-27 02:34:32 +0530, Ashutosh Sharma wrote:
> As mentioned in my earlier mail, I was not able to apply
> pinunpin-cas-5.patch on commit 6150a1b0,
That's not surprising; that's pretty old.
> so I applied it on the latest commit instead, which succeeded. I have now
> taken performance readings at the latest commit, i.e. 76281aa9, with and
> without pinunpin-cas-5.patch. [...] Note: here, the expected performance
> is the performance observed before commit 6150a1b0 with ac1d794 reverted.
Thanks for doing these benchmarks. What's the performance if you revert
6150a1b0 on top of a recent master? There've been a lot of other patches
influencing performance since 6150a1b0, so minor performance differences
aren't necessarily meaningful; especially when that older version then
had other patches reverted.
Thanks,
Andres
Hi,
I am unable to revert 6150a1b0 on top of the recent commit on the master
branch. It seems some commit made recently has a dependency on 6150a1b0.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Sun, Mar 27, 2016 at 02:15:50PM +0200, Andres Freund wrote:
> Thanks for doing these benchmarks. What's the performance if you revert
> 6150a1b0 on top of a recent master?
[This is a generic notification.]
The above-described topic is currently a PostgreSQL 9.6 open item. Andres,
since you committed the patch believed to have created it, you own this open
item. If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this. Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution. Please
present, within 72 hours, a plan to fix the defect within seven days of this
message. Thanks.
On Thu, Mar 31, 2016 at 01:10:56AM -0400, Noah Misch wrote:
> [This is a generic notification.]
> The above-described topic is currently a PostgreSQL 9.6 open item.
> Andres, since you committed the patch believed to have created it, you own
> this open item.
My attribution above was incorrect. Robert Haas is the committer and owner of
this one. I apologize.
On March 31, 2016 7:16:33 AM GMT+02:00, Noah Misch <noah@leadboat.com> wrote:
> My attribution above was incorrect. Robert Haas is the committer and
> owner of this one. I apologize.
Fine in this case, I guess. I've posted a proposal nearby either way; it
appears to be a !x86 problem.
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Thu, Mar 31, 2016 at 3:51 AM, Andres Freund <andres@anarazel.de> wrote:
> Fine in this case, I guess. I've posted a proposal nearby either way; it
> appears to be a !x86 problem.
To which proposal are you referring?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2016-03-31 06:43:19 -0400, Robert Haas wrote:
> To which proposal are you referring?
1) in /messages/by-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de
On Thu, Mar 31, 2016 at 6:45 AM, Andres Freund <andres@anarazel.de> wrote:
> > To which proposal are you referring?
> 1) in /messages/by-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de
OK. So, Noah, my proposed strategy is to wait and see if Andres can
make that work, and if not, then revisit the issue of what to do.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> OK. So, Noah, my proposed strategy is to wait and see if Andres can make
> that work, and if not, then revisit the issue of what to do.
I thought that proposal had already crashed and burned, on the grounds
that byte-size spinlocks require instructions that many PPC machines
don't have.
regards, tom lane
On Thu, Mar 31, 2016 at 10:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I thought that proposal had already crashed and burned, on the grounds
> that byte-size spinlocks require instructions that many PPC machines
> don't have.
So the current status of this issue is:
1. Andres committed a patch (008608b9d51061b1f598c197477b3dc7be9c4a64)
to reduce the size of an LWLock by an amount equal to the size of a
mutex (modulo alignment).
2. Andres also committed a patch
(48354581a49c30f5757c203415aa8412d85b0f70) to remove the spinlock from
a BufferDesc, which also reduces its size, I think, because it
replaces members of types BufFlags (2 bytes), uint8, slock_t, and
unsigned with a single member of type pg_atomic_uint32.
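To illustrate the idea behind that second change (a sketch with assumed bit
widths, field names, and a hypothetical pin_buffer helper, not the committed
code): the old flags, refcount, usage count, and header spinlock collapse
into one atomic 32-bit state word, so an uncontended pin becomes a single
compare-and-swap.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BUF_REFCOUNT_ONE    1U
#define BUF_REFCOUNT_MASK   ((1U << 18) - 1)    /* low bits: shared refcount */
#define BUF_USAGECOUNT_ONE  (1U << 18)          /* next bits: usage count */
#define BM_LOCKED           (1U << 22)          /* one flag bit stands in for
                                                 * the old header spinlock */

typedef struct BufferDescState
{
    _Atomic uint32_t state;     /* flags | usage count | refcount */
} BufferDescState;

/* Pin without a spinlock: CAS-loop a refcount increment into the state. */
static bool
pin_buffer(BufferDescState *buf)
{
    uint32_t    old = atomic_load_explicit(&buf->state, memory_order_relaxed);

    for (;;)
    {
        if (old & BM_LOCKED)
            return false;       /* the real code waits for the bit instead */
        if (atomic_compare_exchange_weak(&buf->state, &old,
                                         old + BUF_REFCOUNT_ONE))
            return true;        /* refcount bumped in one atomic step */
    }
}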
The reason why these changes are relevant is because Andres thought
the observed regression might be related to the BufferDesc growing to
more than 64 bytes on POWER, which in turn could cause buffer
descriptors to get split across cache lines. However, in the
meantime, I did some performance tests on the same machine that Amit
used for testing in the email that started this thread:
/messages/by-id/CA+TgmoZJdA6K7-17K4A48rVB0UPR98HVuaNcfNNLrGsdb1uChg@mail.gmail.com
The upshot of that is that (1) the performance degradation I saw was
significant but smaller than what Amit reported in the OP, and (2) it
looked like the patches Andres gave me to test at the time got
performance back to about the same level we were at before 6150a1b0.
So there's room for optimism that this is fixed, but perhaps some
retesting is in order, since what was committed was, I think, not
identical to what I tested.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Apr 12, 2016 at 05:36:07PM -0400, Robert Haas wrote:
> So there's room for optimism that this is fixed, but perhaps some
> retesting is in order, since what was committed was, I think, not
> identical to what I tested.
That sounds like this open item is ready for CLOSE_WAIT status; is it?
If someone does retest this, it would be informative to see how the system
performs with 6150a1b0 reverted. Your testing showed performance of 6150a1b0
alone and of 6150a1b0 plus predecessors of 008608b and 4835458. I don't
recall seeing figures for 008608b + 4835458 - 6150a1b0, though.
On Tue, Apr 12, 2016 at 10:30 PM, Noah Misch <noah@leadboat.com> wrote:
> That sounds like this open item is ready for CLOSE_WAIT status; is it?
I just retested this on power2. Here are the results. I retested
3fed4174 and 6150a1b0 plus master as of deb71fa9. 5-minute pgbench -S
runs, scale factor 300, with predictable prewarming to minimize
variation, as well as numactl --interleave. Each result is a median
of three.
1 client:    3fed4174 = 13701.014931,  6150a1b0 = 13669.626916,  master = 19685.571089
8 clients:   3fed4174 = 126676.357079, 6150a1b0 = 125239.911105, master = 122940.079404
32 clients:  3fed4174 = 323989.685428, 6150a1b0 = 338638.095126, master = 333656.861590
64 clients:  3fed4174 = 495434.372578, 6150a1b0 = 457794.475129, master = 493034.922791
128 clients: 3fed4174 = 376412.090366, 6150a1b0 = 363157.294391, master = 625498.280370
On this test 8, 32, and 64 clients are coming out about the same as
3fed4174, but 1 client and 128 clients are dramatically improved with
current master. The 1-client result is a lot more surprising than the
128-client result; I don't know what's going on there. But anyway I
don't see a regression here.
So, yes, I would say this should go to CLOSE_WAIT at this point,
unless Amit or somebody else turns up further evidence of a continuing
issue here.
Random points of possible interest:
1. During a 128-client run, top shows about 45% user time, 10% system
time, 45% idle.
2. About 3 minutes into a 128-client run, perf looks like this
(substantially abridged):
3.55%  postgres  postgres      [.] GetSnapshotData
2.15%  postgres  postgres      [.] LWLockAttemptLock
       |--32.82%-- LockBuffer
       |           |--48.59%-- _bt_relandgetbuf
       |           |--44.07%-- _bt_getbuf
       |--29.81%-- ReadBuffer_common
       |--23.88%-- GetSnapshotData
       |--5.30%--  LockAcquireExtended
2.12%  postgres  postgres      [.] LWLockRelease
2.02%  postgres  postgres      [.] _bt_compare
1.88%  postgres  postgres      [.] hash_search_with_hash_value
       |--47.21%-- BufTableLookup
       |--10.93%-- LockAcquireExtended
       |--5.43%--  GetPortalByName
       |--5.21%--  ReadBuffer_common
       |--4.68%--  RelationIdGetRelation
1.87%  postgres  postgres      [.] AllocSetAlloc
1.42%  postgres  postgres      [.] PinBuffer.isra.3
0.96%  postgres  libc-2.17.so  [.] __memcpy_power7
0.89%  postgres  postgres      [.] UnpinBuffer.constprop.7
0.80%  postgres  postgres      [.] PostgresMain
0.80%  postgres  postgres      [.] pg_encoding_mbcliplen
0.71%  postgres  postgres      [.] hash_any
0.62%  postgres  postgres      [.] AllocSetFree
0.59%  postgres  postgres      [.] palloc
0.57%  postgres  libc-2.17.so  [.] _int_free
A context-switch profile, somewhat amazingly, shows no context
switches for anything other than waiting on client read, implying that
performance is entirely constrained by memory bandwidth and CPU speed,
not lock contention.
> If someone does retest this, it would be informative to see how the system
> performs with 6150a1b0 reverted. Your testing showed performance of
> 6150a1b0 alone and of 6150a1b0 plus predecessors of 008608b and 4835458. I
> don't recall seeing figures for 008608b + 4835458 - 6150a1b0, though.
That revert isn't trivial: even what exactly that would mean at this
point is somewhat subjective. I'm also not sure there is much point.
6150a1b08a9fe7ead2b25240be46dddeae9d98e1 was written in such a way
that only platforms with single-byte spinlocks were going to have a
BufferDesc that fits into 64 bytes, which in retrospect was a bit
short-sighted. Because the changes that were made to get it back down
to 64 bytes might also have other performance-relevant consequences,
it's a bit hard to be sure that that was the precise thing that caused
the regression. And of course there was a flurry of other commits going
in at the same time, some even on related topics, which further adds
to the difficulty of pinpointing this precisely. All that is a bit
unfortunate in some sense, but I think we're just going to have to
keep moving forward and hope for the best.
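For what it's worth, one way to guard against a repeat (my own sketch, with
an assumed padded size and stand-in types, not necessarily what the tree
does) is to pad the descriptor to a declared size and let the build fail if
it ever outgrows it:

#include <assert.h>                     /* C11 static_assert */

#define BUFFERDESC_PAD_TO_SIZE 64       /* assumed target on this platform */

typedef struct BufferDescSketch
{
    char        fields[60];             /* stand-in for the real members */
} BufferDescSketch;

typedef union BufferDescPaddedSketch
{
    BufferDescSketch desc;
    char        pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPaddedSketch;

static_assert(sizeof(BufferDescSketch) <= BUFFERDESC_PAD_TO_SIZE,
              "BufferDesc grew past its padded size");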
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Apr 12, 2016 at 11:40:43PM -0400, Robert Haas wrote:
> > That sounds like this open item is ready for CLOSE_WAIT status; is it?
> I just retested this on power2.
> So, yes, I would say this should go to CLOSE_WAIT at this point, unless
> Amit or somebody else turns up further evidence of a continuing issue
> here.
Thanks for testing again.
> That revert isn't trivial: even what exactly that would mean at this point
> is somewhat subjective. [...] All that is a bit unfortunate in some sense,
> but I think we're just going to have to keep moving forward and hope for
> the best.
I can live with that.
On Wed, Apr 13, 2016 at 9:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, yes, I would say this should go to CLOSE_WAIT at this point, unless
> Amit or somebody else turns up further evidence of a continuing issue
> here.
Yes, I also think this particular issue can be closed. However, I feel the
observation about performance variation still stands, as I never needed to
prewarm or do anything else to get consistent results during my work on 9.5
or early 9.6. Also, Andres, Alexander, and I are working on a similar
observation (run-to-run performance variation) in a nearby thread [1].
[1]: /messages/by-id/20160412160246.nyzil35w3wein5fm@alap3.anarazel.de
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 13, 2016 at 11:22 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> However, I feel the observation about performance variation still stands,
> as I never needed to prewarm or do anything else to get consistent results
> during my work on 9.5 or early 9.6.
Yeah. My own measurements do not seem to support the idea that the
variance recently increased, but I haven't tested incredibly widely.
It may be that whatever is causing the variance is something that used
to be hidden by locking bottlenecks and now no longer is.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company