Performance degradation of REFRESH MATERIALIZED VIEW
Hi,
While discussing freezing tuples during CTAS[1], we found that
heap_insert() with HEAP_INSERT_FROZEN brings performance degradation.
For instance, with Paul's patch that sets HEAP_INSERT_FROZEN to CTAS,
it took 12 sec whereas the code without the patch took 10 sec with the
following query:
create table t1 (a, b, c, d) as select i,i,i,i from
generate_series(1,20000000) i;
I've done a simple benchmark of REFRESH MATERIALIZED VIEW with the
following queries:
create table source as select generate_series(1, 50000000);
create materialized view mv as select * from source;
refresh materialized view mv;
The execution times of REFRESH MATERIALIZED VIEW are:
w/ HEAP_INSERT_FROZEN flag : 42 sec
w/o HEAP_INSERT_FROZEN flag : 33 sec
After investigation, I found that this performance degradation happens
only with HEAD code. It seems to me that commit 39b66a91b (and
7db0cd2145) is relevant; it has heap_insert() set VM bits and
PD_ALL_VISIBLE if HEAP_INSERT_FROZEN is specified (so CCing Tomas
Vondra and authors). Since heap_insert() sets PD_ALL_VISIBLE on the
page when inserting a tuple for the first time on the page (around
L2133 in heapam.c), every subsequent heap_insert() on the page reads
and pins a VM buffer (see RelationGetBufferForTuple()). Reading and
pinning a VM buffer for every insertion is very costly. This doesn't
happen in heap_multi_insert() since it sets the VM bits after filling
the heap page with tuples. Therefore, there is no such performance
degradation between COPY and COPY FREEZE when they use
heap_multi_insert() (i.e., CIM_MULTI). Paul also reported it in that
thread.
As far as I read the thread and commit messages related to those
commits, they are intended to COPY FREEZE and I could not find any
discussion and mention about REFRESH MATERIALIZED VIEW. So I'm
concerned we didn't expect such performance degradation.
Setting VM bits and PD_ALL_VISIBLE at REFRESH MATERIALIZED VIEW would
be a good choice in some cases. Since materialized views are read-only,
VM bits are never cleared after creation. So it might make sense for
users to pay a cost to set them at refresh (note that CREATE
MATERIALIZED VIEW doesn't set VM bits since it's internally treated as
CTAS). On the other hand, given this big performance degradation
(about 20%), users might want to rely on autovacuum so that VM bits are
set in the background. However, unlike COPY, there is no way to
disable freezing tuples for REFRESH MATERIALIZED VIEW. So every user
would incur those costs and be affected by the performance
degradation. I'm concerned that it could be a problem.
What do you think?
Regards,
[1]: /messages/by-id/FB1F5E2D-CBD1-4645-B74C-E0A1BFAE4AC8@vmware.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Hi,
On 2021-03-11 17:44:37 +0900, Masahiko Sawada wrote:
> The execution times of REFRESH MATERIALIZED VIEW are:
> w/ HEAP_INSERT_FROZEN flag : 42 sec
> w/o HEAP_INSERT_FROZEN flag : 33 sec
Probably worth adding as an open item for 14.
Greetings,
Andres Freund
On Fri, Mar 12, 2021 at 3:13 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2021-03-11 17:44:37 +0900, Masahiko Sawada wrote:
> > The execution times of REFRESH MATERIALIZED VIEW are:
> > w/ HEAP_INSERT_FROZEN flag : 42 sec
> > w/o HEAP_INSERT_FROZEN flag : 33 sec
>
> Probably worth adding as an open item for 14.
I've added it to PostgreSQL 14 Open Items.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Mar 11, 2021 at 5:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi,
> Since heap_insert() sets PD_ALL_VISIBLE on the page when inserting a
> tuple for the first time on the page (around L2133 in heapam.c), every
> subsequent heap_insert() on the page reads and pins a VM buffer (see
> RelationGetBufferForTuple()).
IIUC RelationGetBufferForTuple() pins the vm buffer if the page is
all-visible, since the caller might clear the vm bit during the
operation. But that's not necessarily true in the HEAP_INSERT_FROZEN
case. When inserting with HEAP_INSERT_FROZEN, we might set the
PD_ALL_VISIBLE flag and the all-visible bit, but we never clear them
during insertion. Therefore, to fix this issue, I think we can have
RelationGetBufferForTuple() not pin the vm buffer if we're inserting a
frozen tuple (i.e., the HEAP_INSERT_FROZEN case) and the target page is
already all-visible. With HEAP_INSERT_FROZEN, the only case where we
need to pin the vm buffer is when the target page is empty. That way,
we pin the vm buffer only the first time we insert a frozen tuple into
an empty page, then set PD_ALL_VISIBLE on the page and the all-frozen
bit on the vm, and also set XLH_INSERT_ALL_FROZEN_SET in the WAL
record. At further insertions, we would not pin the vm buffer as long
as we're inserting frozen tuples into the same page.
If the target page is neither empty nor all-visible we will not pin the
vm buffer, which is fine because if the page has a non-frozen tuple we
cannot set the bit on the vm during heap_insert(). If all tuples on the
page are already frozen but PD_ALL_VISIBLE is not set for some reason,
we would be able to set the all-frozen bit on the vm, but it seems not
a good idea since it would require checking during insertion whether
all existing tuples are frozen.
The attached patch does the above idea. With this patch, the same
performance tests took 33 sec.
Also, I've measured the number of page reads during REFRESH
MATERIALIZED VIEW using pg_stat_statements. There was a big difference
in shared_blks_hit:
1. w/ HEAP_INSERT_FROZEN flag (HEAD) : 50221781
2. w/o HEAP_INSERT_FROZEN flag (HEAD) : 221782
3. Patched : 443014
Since the 'source' table has 50000000 rows and each heap_insert() reads
the vm buffer, test 1 reads about as many pages as the number of
inserted tuples. The value of test 3 is about twice that of test 2.
This is because heap_insert() reads the vm buffer for each first
insertion into a page. The table has 221239 blocks.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
skip_vmbuffer_for_frozen_tuple_insertion.patch
At Mon, 12 Apr 2021 15:20:41 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
> Therefore, to fix this issue, I think we can have
> RelationGetBufferForTuple() not pin the vm buffer if we're inserting a
> frozen tuple (i.e., the HEAP_INSERT_FROZEN case) and the target page
> is already all-visible.
It seems right to me.
> The attached patch does the above idea. With this patch, the same
> performance tests took 33 sec.
Great! The direction of the patch looks fine to me.
+ * If we're inserting frozen entry into empty page, we will set
+ * all-visible to page and all-frozen on visibility map.
+ */
+ if (PageGetMaxOffsetNumber(page) == 0)
+     all_frozen_set = true;
AFAICS the page is always empty when RelationGetBufferForTuple
returned a valid vmbuffer. So the "if" should be an "assert" instead.
And, the patch changes the value of all_frozen_set to false when the
page was already all-frozen (thus not empty). It would be fine since
we don't need to change the visibility of the page in that case but
the variable name is no longer correct. set_all_visible or such?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Apr 16, 2021 at 12:16 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Thank you for the comments.
> Great! The direction of the patch looks fine to me.
>
> AFAICS the page is always empty when RelationGetBufferForTuple
> returned a valid vmbuffer. So the "if" should be an "assert" instead.
There is a chance that RelationGetBufferForTuple() returns a valid
vmbuffer but the page is not empty, since RelationGetBufferForTuple()
checks whether the page is empty without a lock. But in the
HEAP_INSERT_FROZEN case it actually doesn't happen, at least for now,
since only one process inserts tuples into the relation. Will fix.
> And, the patch changes the value of all_frozen_set to false when the
> page was already all-frozen (thus not empty). It would be fine since
> we don't need to change the visibility of the page in that case but
> the variable name is no longer correct. set_all_visible or such?
It seems to me that the variable name all_frozen_set corresponds to
XLH_INSERT_ALL_FROZEN_SET, but I see your point. How about
set_all_frozen instead, since we set all-frozen bits (which also
implies setting all-visible)?
BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET but
there is no all_visible_set anywhere:
/* all_frozen_set always implies all_visible_set */
#define XLH_INSERT_ALL_FROZEN_SET (1<<5)
I'll update those comments as well.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
At Mon, 19 Apr 2021 13:32:31 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
> There is a chance that RelationGetBufferForTuple() returns a valid
> vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> checks without a lock if the page is empty. But when it comes to
> HEAP_INSERT_FROZEN cases it actually doesn't happen at least for now
> since only one process inserts tuples into the relation. Will fix.
Yes. It seems to me that it is cleaner that RelationGetBufferForTuple
returns a vmbuffer only when the caller needs to change the vm state.
Thanks.
> It seems to me that the variable name all_frozen_set corresponds to
> XLH_INSERT_ALL_FROZEN_SET but I see your point. How about
> set_all_frozen instead since we set all-frozen bits (also implying
> setting all-visible)?
Right. However, "if (set_all_frozen) then set all_visible" looks like
a bug^^;. all_frozen_set looks better in that context than
set_all_frozen. So I withdraw the comment.
> BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET
> but there is no all_visible_set anywhere:
>
> /* all_frozen_set always implies all_visible_set */
> #define XLH_INSERT_ALL_FROZEN_SET (1<<5)
>
> I'll update those comments as well.
FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
to be set together". The current comment works for me.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Apr 19, 2021 at 5:04 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
> to be set together". The current comment works for me.
Okay, I've updated the patch accordingly. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
skip_vmbuffer_for_frozen_tuple_insertion_v2.patch
On Mon, Apr 19, 2021 at 1:57 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Okay, I've updated the patch accordingly. Please review it.
I was reading the patch, just found some typos: it should be "a frozen
tuple" not "an frozen tuple".
+ * Also pin visibility map page if we're inserting an frozen tuple into
+ * If we're inserting an frozen entry into empty page, pin the
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Apr 19, 2021 at 8:04 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I was reading the patch, just found some typos: it should be "a frozen
> tuple" not "an frozen tuple".
Thank you for the comment.
I’ve updated the patch including the above comment.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
skip_vmbuffer_for_frozen_tuple_insertion_v3.patch
On Mon, Apr 19, 2021 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I’ve updated the patch including the above comment.
Thanks for the patch.
I was trying to understand the statements below:
+ * we check without a buffer lock if the page is empty but the
+ * caller doesn't need to recheck that since we assume that in
+ * HEAP_INSERT_FROZEN case, only one process is inserting a
+ * frozen tuple into this relation.
+ *
And earlier comments from upthread:
> AFAICS the page is always empty when RelationGetBufferForTuple
> returned a valid vmbuffer. So the "if" should be an "assert" instead.

> There is a chance that RelationGetBufferForTuple() returns a valid
> vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> checks without a lock if the page is empty. But when it comes to
> HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> since only one process inserts tuples into the relation. Will fix.
I'm not sure whether it is safe to assume "at least for now, since only
one process inserts tuples into the relation". What if we allow
parallel inserts for HEAP_INSERT_FROZEN cases? I don't know whether we
can do that or not. Correct me if I'm wrong.
While we are modifying something in heap_insert:
1) Can we adjust the comment below in heap_insert to the 80-char limit?
* If we're inserting frozen entry into an empty page,
* set visibility map bits and PageAllVisible() hint.
2) I'm thinking whether we can do page = BufferGetPage(buffer); after
RelationGetBufferForTuple and use it in all the places where
BufferGetPage(buffer) is currently being used, such as
PageIsAllVisible(BufferGetPage(buffer)) and
PageClearAllVisible(BufferGetPage(buffer)); we could even remove the
local variable page in the if (RelationNeedsWAL(relation)) block.
3) We could as well get the block number once and use it in all the
places in heap_insert, thus removing extra calls of
BufferGetBlockNumber(buffer).
Thoughts?
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 20, 2021 at 11:04 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> I'm not sure whether it is safe to assume "at least for now since only
> one process inserts tuples into the relation". What if we allow
> parallel inserts for HEAP_INSERT_FROZEN cases? I don't know whether we
> can do that or not. Correct me if I'm wrong.
I think if my assumption is wrong, or if we allow parallel inserts for
HEAP_INSERT_FROZEN cases in the future, we need to deal with the case
where frozen tuples are concurrently inserted into the same page. For
example, we can release the vmbuffer when we see the page is no longer
empty, or we can return a valid buffer but require the caller to
re-check if the page is still empty. The previous version of the patch
took the former approach: heap_insert() rechecked whether the page was
still empty in the HEAP_INSERT_FROZEN case and set all_frozen_set if
so. But AFAICS concurrent insertion of frozen tuples into the same
page doesn't happen for now (COPY FREEZE and REFRESH MATERIALIZED VIEW
are the users of HEAP_INSERT_FROZEN), as also pointed out by
Horiguchi-san. So I added comments and assertions rather than
addressing a case that never happens with the current code. If
concurrent insertion of frozen tuples into the same page does happen,
we should get the assertion failure that I added in
RelationGetBufferForTuple().
> While we are modifying something in heap_insert:
> 1) Can we adjust the comment below in heap_insert to the 80-char limit?
> 2) I'm thinking whether we can do page = BufferGetPage(buffer); after
> RelationGetBufferForTuple and use it in all the places where
> BufferGetPage(buffer) is currently being used.
> 3) We could as well get the block number once and use it in all the
> places in heap_insert, thus removing extra calls of
> BufferGetBlockNumber(buffer).
All points are reasonable to me. I'll incorporate them in the next version.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Apr 20, 2021 at 11:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Apr 20, 2021 at 11:04 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Mon, Apr 19, 2021 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I’ve updated the patch including the above comment.
Thanks for the patch.
I was trying to understand the statements below:
+ * we check without a buffer lock if the page is empty but the
+ * caller doesn't need to recheck that since we assume that in
+ * HEAP_INSERT_FROZEN case, only one process is inserting a
+ * frozen tuple into this relation.
+ *
And earlier comments from upthread:
AFAICS the page is always empty when RelationGetBufferForTuple
returned a valid vmbuffer. So the "if" should be an "assert" instead.
There is a chance that RelationGetBufferForTuple() returns a valid
vmbuffer but the page is not empty, since RelationGetBufferForTuple()
checks without a lock if the page is empty. But when it comes to
HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
since only one process inserts tuples into the relation. Will fix.
I'm not sure whether it is safe to assume "at least for now since only
one process inserts tuples into the relation". What if we allow
parallel inserts for HEAP_INSERT_FROZEN cases, I don't know whether we
can do that or not. Correct me if I'm wrong.
I think if my assumption is wrong or we allow parallel insert for
HEAP_INSERT_FROZEN cases in the future, we need to deal with the case
where frozen tuples are concurrently inserted into the same page. For
example, we can release vmbuffer when we see the page is no longer
empty, or we can return a valid buffer but require the caller to
re-check if the page is still empty. The previous version patch took
the former approach. More concretely, heap_insert() rechecked if the
page is still empty in HEAP_INSERT_FROZEN case and set all_frozen_set
if so. But AFAICS concurrently inserting frozen tuples into the same
page doesn’t happen for now (COPY FREEZE and REFRESH MATERIALIZED VIEW
are users of HEAP_INSERT_FROZEN), also pointed out by Horiguchi-san.
So I added comments and assertions rather than addressing the case
that never happens with the current code. If concurrently inserting
frozen tuples into the same page happens, we should get the assertion
failure that I added in RelationGetBufferForTuple().
Upon thinking further, concurrent insertions into the same page are
not possible while we are in heap_insert, between
RelationGetBufferForTuple and UnlockReleaseBuffer(buffer).
RelationGetBufferForTuple will lock the buffer in exclusive mode, see
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); and the comment " * Returns
pinned and exclusive-locked buffer of a page in given relation". Even
if parallel insertions are allowed in HEAP_INSERT_FROZEN cases, each
worker will separately acquire pages, insert into them, and skip
getting a visibility map page pin if the page is set all-visible by
another worker.
Some more comments on v3 patch:
1) Isn't it good to specify here what we gain by avoiding pinning the
visibility map page, something like: gain a few seconds / avoid extra
function calls, or some other better wording?
+ * If the page already is non-empty and all-visible, we skip to
+ * pin on a visibility map buffer since we never clear and set
+ * all-frozen bit on visibility map during inserting a frozen
+ * tuple.
+ */
2) Isn't it good to put PageIsAllVisible(BufferGetPage(buffer)) in
the if clause instead of the else if clause? Since it is going to be
hit most of the time, we could avoid the page-empty check every time:
+ if (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0)
+ visibilitymap_pin(relation, targetBlock, vmbuffer);
+ else if (PageIsAllVisible(BufferGetPage(buffer)))
+ skip_vmbuffer = true;
3) I found another typo in v3 - it is "will set" not "will sets":
+ * In HEAP_INSERT_FROZEN cases, we handle the possibility that the
caller will
+ * sets all-frozen bit on the visibility map page. We pin on the visibility
4) I think a commit message can be added to the upcoming patch.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Hi,
I took a look at this today, as I committed 39b66a91b back in January. I
can reproduce the issue, with just 1M rows the before/after timings are
roughly 480ms and 620ms on my hardware.
Unfortunately, the v3 patch does not really fix the issue for me. The
timing with it applied is ~610ms so the improvement is only minimal.
I'm not sure what to do about this :-( I don't have any ideas about how
to eliminate this overhead, so the only option I see is reverting the
changes in heap_insert. Unfortunately, that'd mean inserts into TOAST
tables won't be frozen ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...
ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.
It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.
And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
Greetings,
Andres Freund
On 4/26/21 9:27 PM, Andres Freund wrote:
Hi,
On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...
ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.
It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.
And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
Yeah. The question still is what to do about 14, though. Shall we leave
the code as it is now, or should we change it somehow? It seems a bit
unfortunate that a COPY FREEZE optimization should negatively influence
other (more) common use cases, so I guess we can't just keep the current
code ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
On 4/26/21 9:27 PM, Andres Freund wrote:
On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...
ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.
It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.
And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
Yeah. The question still is what to do about 14, though. Shall we leave the
code as it is now, or should we change it somehow? It seems a bit unfortunate
that a COPY FREEZE optimization should negatively influence other (more)
common use cases, so I guess we can't just keep the current code ...
I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
and see whether that fixes the regression. If it does, then we can
analyze whether that's possibly the best way forward. Or whether we
revert, live with the regression or find yet another path.
Greetings,
Andres Freund
On Mon, Apr 26, 2021 at 10:31 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Hi,
I took a look at this today, as I committed 39b66a91b back in January. I
can reproduce the issue, with just 1M rows the before/after timings are
roughly 480ms and 620ms on my hardware.
Unfortunately, the v3 patch does not really fix the issue for me. The
timing with it applied is ~610ms so the improvement is only minimal.
Since reading the vmbuffer is likely to hit shared buffers while
inserting frozen tuples, I think the improvement may not be visible
with only a few million tuples, depending on hardware. But it might
not be as fast as before commit 39b66a91b, since we read the vmbuffer
at least once per insertion.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
On 4/26/21 9:27 PM, Andres Freund wrote:
On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...
ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.
It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.
And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
Yeah. The question still is what to do about 14, though. Shall we leave the
code as it is now, or should we change it somehow? It seems a bit unfortunate
that a COPY FREEZE optimization should negatively influence other (more)
common use cases, so I guess we can't just keep the current code ...
I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
and see whether that fixes the regression.
Is the idea to have RelationGetBufferForTuple() skip re-pinning the
vmbuffer? If so, is this essentially the same as the approach in the v3
patch?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On 4/27/21 7:34 AM, Masahiko Sawada wrote:
On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
On 4/26/21 9:27 PM, Andres Freund wrote:
On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...
ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.
It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.
And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
Yeah. The question still is what to do about 14, though. Shall we leave the
code as it is now, or should we change it somehow? It seems a bit unfortunate
that a COPY FREEZE optimization should negatively influence other (more)
common use cases, so I guess we can't just keep the current code ...
I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
and see whether that fixes the regression.
Is the idea to have RelationGetBufferForTuple() skip re-pinning the
vmbuffer? If so, is this essentially the same as the approach in the v3
patch?
I don't think it is the same approach - it's a bit hard to follow what
exactly happens in RelationGetBufferForTuple, but AFAICS it always
starts with vmbuffer = InvalidBuffer, so it may pin the vmbuffer quite
often, no?
What Andres is suggesting (I think) is to modify ExecInsert() to pass a
valid bistate to table_tuple_insert, instead of just NULL, and store the
vmbuffer in it. Not sure how to identify when inserting more than just a
single row, though ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company