GiST index build missing smgrimmedsync()?

Started by Melanie Plageman almost 4 years ago, 2 messages
#1 Melanie Plageman
melanieplageman@gmail.com

I brought this up in [1] but wanted to start a dedicated thread.

Since 16fa9b2b30a357 GiST indexes are not built in shared buffers.
However, smgrimmedsync() is not done anywhere and skipFsync=true is
always passed to smgrwrite() and smgrextend(). So, if a checkpoint
starts and finishes after WAL-logging some of the index build pages and
there is a crash sometime before the dirty pages make it to permanent
storage, won't that data be lost?

(Also, as Heikki has pointed out before, passing skipFsync=false isn't
sufficient either if the checkpoint finishes before
register_dirty_segment(), so it seems like smgrimmedsync() is needed.)
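
To make the failure window concrete, here is a toy model of the scenario described above. It is only a sketch, not PostgreSQL code: the ToyStorage struct and every toy_* function are hypothetical names invented for illustration, and the model assumes the worst case in which the kernel never writes back the build's pages on its own before the crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES 4

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
typedef struct
{
    int os_cache[NPAGES];   /* page contents handed to the kernel */
    int durable[NPAGES];    /* page contents that actually reached disk */
    int wal_lsn[NPAGES];    /* LSN of the WAL record for each page, 0 = none */
    int wal_value[NPAGES];  /* page image stored in that WAL record */
    int next_lsn;           /* next LSN to assign */
    int redo_lsn;           /* recovery replays records with LSN >= redo_lsn */
} ToyStorage;

static void toy_init(ToyStorage *s)
{
    memset(s, 0, sizeof(*s));
    s->next_lsn = 1;
    s->redo_lsn = 1;
}

/* Write a page outside shared buffers (like skipFsync=true) and WAL-log it. */
static void toy_build_write(ToyStorage *s, int blk, int value)
{
    s->os_cache[blk] = value;
    s->wal_lsn[blk] = s->next_lsn++;
    s->wal_value[blk] = value;
}

/* The checkpoint knows nothing about the build's pages; it only moves redo. */
static void toy_checkpoint(ToyStorage *s)
{
    s->redo_lsn = s->next_lsn;
}

/* Stand-in for an immediate sync: force cached pages to durable storage. */
static void toy_immedsync(ToyStorage *s)
{
    memcpy(s->durable, s->os_cache, sizeof(s->durable));
}

/* Crash: the OS cache is lost, then WAL is replayed from the redo point. */
static void toy_crash_and_recover(ToyStorage *s)
{
    memcpy(s->os_cache, s->durable, sizeof(s->os_cache));
    for (int blk = 0; blk < NPAGES; blk++)
    {
        if (s->wal_lsn[blk] != 0 && s->wal_lsn[blk] >= s->redo_lsn)
            s->durable[blk] = s->wal_value[blk];
    }
}

/* Returns how many of the four index pages hold the right data after a crash. */
int pages_surviving_crash(bool sync_before_finish)
{
    ToyStorage s;
    int survived = 0;

    toy_init(&s);
    toy_build_write(&s, 0, 100);
    toy_build_write(&s, 1, 101);
    toy_checkpoint(&s);             /* starts and finishes mid-build */
    toy_build_write(&s, 2, 102);
    toy_build_write(&s, 3, 103);
    if (sync_before_finish)
        toy_immedsync(&s);
    toy_crash_and_recover(&s);

    for (int blk = 0; blk < NPAGES; blk++)
    {
        if (s.durable[blk] == 100 + blk)
            survived++;
    }
    return survived;
}
```

In this model, pages_surviving_crash(false) loses the two pages logged before the checkpoint, because recovery starts replaying after them and nothing ever forced them to disk. pages_surviving_crash(true) keeps all four.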

Seems like the following would address this:

diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 9854116fca..364a7455ea 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -454,9 +454,11 @@ gist_indexsortbuild(GISTBuildState *state)
        smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
                          pagestate->page, true);
        if (RelationNeedsWAL(state->indexrel))
+       {
                log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
                                        pagestate->page, true);
-
+               smgrimmedsync(RelationGetSmgr(state->indexrel), MAIN_FORKNUM);
+       }
        pfree(pagestate->page);
        pfree(pagestate);
 }

- Melanie

[1]: /messages/by-id/CAAKRu_ZmkCNz-=5U5wZyyi3=Usfs1btsTKc_L9r1ceFvbxE3kg@mail.gmail.com

#2 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#1)
Re: GiST index build missing smgrimmedsync()?

On 23/02/2022 23:30, Melanie Plageman wrote:

> I brought this up in [1] but wanted to start a dedicated thread.
>
> Since 16fa9b2b30a357 GiST indexes are not built in shared buffers.
> However, smgrimmedsync() is not done anywhere and skipFsync=true is
> always passed to smgrwrite() and smgrextend(). So, if a checkpoint
> starts and finishes after WAL-logging some of the index build pages and
> there is a crash sometime before the dirty pages make it to permanent
> storage, won't that data be lost?

Yes, good catch!

> Seems like the following would address this:

Committed essentially that, except that I put the smgrimmedsync in a
separate if-block, and copied the comment from the similar piece of code
from nbtsort.c to explain the issue.
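
The committed code is not quoted in this thread, so the following is only a guess at the shape described: the WAL-logging and the sync sit in two separate if-blocks, with the explanatory comment attached to the second. ToyIndexRel and the toy_* functions are hypothetical stand-ins that merely count calls; they are not the PostgreSQL smgr or WAL APIs, and the comment text is a paraphrase of the reasoning in this thread, not the one copied from nbtsort.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an index relation; it only counts calls. */
typedef struct
{
    bool needs_wal;     /* plays the role of RelationNeedsWAL() */
    int  writes;
    int  wal_records;
    int  syncs;
} ToyIndexRel;

static void toy_smgrwrite(ToyIndexRel *rel)     { rel->writes++; }
static void toy_log_newpage(ToyIndexRel *rel)   { rel->wal_records++; }
static void toy_smgrimmedsync(ToyIndexRel *rel) { rel->syncs++; }

/* Sketch of the tail end of a sorted index build. */
void toy_finish_sorted_build(ToyIndexRel *rel)
{
    /* Write the root page, WAL-logging it if the relation is logged. */
    toy_smgrwrite(rel);
    if (rel->needs_wal)
        toy_log_newpage(rel);

    /*
     * Even though the pages were WAL-logged, the file must still be synced.
     * The build bypasses shared buffers, so a checkpoint during the build
     * cannot flush these pages, and a later crash would replay WAL only
     * from that checkpoint onward.
     */
    if (rel->needs_wal)
        toy_smgrimmedsync(rel);
}
```

Keeping the sync in its own block gives the comment an obvious home and keeps the sync decision visually separate from the logging of the root page.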

Thanks!

- Heikki