Adding SMGR discriminator to buffer tags
Hello hackers,
On another thread, lots of undo log-related patches have been traded.
Buried deep in the stack is one that I'd like to highlight and discuss
in a separate thread, because it relates to a parallel thread of
development and it'd be good to get feedback on it.
In commit 3eb77eba, Shawn Debnath and I extended the checkpointer
fsync machinery to support more kinds of files. Next, we'd like to
teach the buffer pool to deal with more kinds of buffers. The context
for this collaboration is that he's working on putting things like
CLOG into shared buffers, and my EDB colleagues and I are working on
putting undo logs into shared buffers. We want a simple way to put
any block-structured stuff into shared buffers, not just plain
"relations".
The questions are: how should buffer tags distinguish different kinds
of buffers, and how should SMGR direct I/O traffic to the right place
when it needs to schlep pages in and out?
In earlier prototype code, I'd been using a special database number
for undo logs. In a recent thread[1], Tom and others didn't like that
idea much, and Shawn mentioned his colleague's idea of stealing unused
bits from the fork number so that there is no net change in tag size,
but we have entirely separate namespaces for each kind of buffered
data.
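To make the bit-stealing idea concrete, here is a minimal sketch. It is not the code from the attached patch: the Sketch* types and sketch_* helpers are hypothetical stand-ins, and it assumes the fork number currently occupies a 32-bit slot that can be split into two 16-bit fields. The point is only that the tag stays the same size while the SMGR ID becomes part of its identity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for Oid and BlockNumber (assumed 32-bit unsigned). */
typedef uint32_t SketchOid;
typedef uint32_t SketchBlockNumber;

typedef struct
{
	SketchOid	spcNode;
	SketchOid	dbNode;
	SketchOid	relNode;
} SketchRelFileNode;

/* Before: the fork number has a whole 32-bit slot to itself. */
typedef struct
{
	SketchRelFileNode rnode;
	int32_t		forkNum;
	SketchBlockNumber blockNum;
} SketchOldTag;

/* After: the same 32 bits hold an SMGR ID and a narrower fork number. */
typedef struct
{
	SketchRelFileNode rnode;
	int16_t		smgrid;
	int16_t		forkNum;
	SketchBlockNumber blockNum;
} SketchNewTag;

/* Compile-time check that the tag did not get wider. */
_Static_assert(sizeof(SketchNewTag) == sizeof(SketchOldTag),
			   "tag must not grow");

/* Build a tag; zero it first so no field is left uninitialized. */
static SketchNewTag
sketch_make_tag(int16_t smgrid, SketchRelFileNode rnode,
				int16_t forkNum, SketchBlockNumber blockNum)
{
	SketchNewTag tag;

	memset(&tag, 0, sizeof(tag));
	tag.rnode = rnode;
	tag.smgrid = smgrid;
	tag.forkNum = forkNum;
	tag.blockNum = blockNum;
	return tag;
}

/* Two tags match only if the SMGR ID matches as well as everything else. */
static int
sketch_tags_equal(const SketchNewTag *a, const SketchNewTag *b)
{
	return a->smgrid == b->smgrid &&
		a->rnode.spcNode == b->rnode.spcNode &&
		a->rnode.dbNode == b->rnode.dbNode &&
		a->rnode.relNode == b->rnode.relNode &&
		a->forkNum == b->forkNum &&
		a->blockNum == b->blockNum;
}
```

With this layout, two buffers that share the same relfilenode, fork and block numbers but belong to different storage managers get distinct tags, which is what gives each kind of buffered data its own namespace.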
Here's a patch that does that, and then makes changes in the main
places I have found so far that need to be aware of the new SMGR ID
field.
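As a rough illustration of how an SMGR ID could direct traffic to the right implementation, here is a sketch of a dispatch table indexed by that ID, with a per-implementation open callback. The names (SketchSmgrId, sketch_smgrsw, sketch_smgropen, and the "undo" entry) are assumptions made for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical SMGR IDs; 0 is assumed to mean regular relation data. */
typedef enum
{
	SKETCH_SMGR_MD = 0,
	SKETCH_SMGR_UNDO = 1,
	SKETCH_NSMGR
} SketchSmgrId;

typedef struct
{
	SketchSmgrId which;		/* index into the dispatch table */
	int			opened_by;	/* records which callback initialized us */
} SketchRelation;

/* One row per storage manager implementation. */
typedef struct
{
	const char *name;
	void		(*smgr_open) (SketchRelation *reln);
} SketchSmgr;

static void
sketch_md_open(SketchRelation *reln)
{
	/* md-specific initialization would go here */
	reln->opened_by = SKETCH_SMGR_MD;
}

static void
sketch_undo_open(SketchRelation *reln)
{
	/* undo-specific initialization would go here */
	reln->opened_by = SKETCH_SMGR_UNDO;
}

static const SketchSmgr sketch_smgrsw[] = {
	{"md", sketch_md_open},
	{"undo", sketch_undo_open},
};

/* Pick the implementation from the ID and let it run its own setup. */
static int
sketch_smgropen(SketchSmgrId id, SketchRelation *reln)
{
	if ((int) id < 0 || (int) id >= SKETCH_NSMGR)
		return -1;				/* unknown storage manager */
	reln->which = id;
	sketch_smgrsw[id].smgr_open(reln);
	return 0;
}
```

The caller supplies the ID explicitly, so nothing has to be inferred from a special database number or similar convention.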
Thoughts?
[1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
--
Thomas Munro
https://enterprisedb.com
Attachments:
0002-Move-tablespace-dir-creation-from-smgr.c-to-md.c.patch (application/octet-stream)
From dc05a625bffd5af3cbc70d2751068f4caa0d22cf Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 30 Apr 2019 22:11:03 +1200
Subject: [PATCH 02/16] Move tablespace dir creation from smgr.c to md.c.
For undo logs, we don't need to create tablespace directories when
opening a relation, because that is managed automatically by
undolog.c.
Author: Thomas Munro
---
src/backend/storage/smgr/md.c | 14 ++++++++++++++
src/backend/storage/smgr/smgr.c | 14 --------------
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 622315fbd16..0a597260ea9 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -196,6 +197,19 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
Assert(reln->md_num_open_segs[forkNum] == 0);
+ /*
+ * We may be using the target table space for the first time in this
+ * database, so create a per-database subdirectory if needed.
+ *
+ * XXX this is a fairly ugly violation of module layering, but this seems
+ * to be the best place to put the check. Maybe TablespaceCreateDbspace
+ * should be here and not in commands/tablespace.c? But that would imply
+ * importing a lot of stuff that smgr.c oughtn't know, either.
+ */
+ TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
+ reln->smgr_rnode.node.dbNode,
+ isRedo);
+
path = relpath(reln->smgr_rnode, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index ef3907f2a79..0968e0c99c1 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -17,7 +17,6 @@
*/
#include "postgres.h"
-#include "commands/tablespace.h"
#include "lib/ilist.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
@@ -343,19 +342,6 @@ smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
if (isRedo && reln->md_num_open_segs[forknum] > 0)
return;
- /*
- * We may be using the target table space for the first time in this
- * database, so create a per-database subdirectory if needed.
- *
- * XXX this is a fairly ugly violation of module layering, but this seems
- * to be the best place to put the check. Maybe TablespaceCreateDbspace
- * should be here and not in commands/tablespace.c? But that would imply
- * importing a lot of stuff that smgr.c oughtn't know, either.
- */
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- isRedo);
-
smgrsw[reln->smgr_which].smgr_create(reln, forknum, isRedo);
}
--
2.21.0
0001-Add-SmgrId-to-smgropen-and-BufferTag.patch (application/octet-stream)
From 34f185094a7e381018769376f39e7857b028347b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 8 Mar 2019 12:03:00 +1300
Subject: [PATCH 01/16] Add SmgrId to smgropen() and BufferTag.
To use bufmgr.c for new kinds of data in addition to plain old
relations, add an SMGR argument to places that identify blocks
and the files that hold them (smgropen(), block references in
the WAL, BufferTag).
To avoid making BufferTag wider, take some space away from the
fork number for this new member, since there are just a few
values possible.
Add a "smgrid" column to the pg_buffercache extension.
Create a new callback for smgropen() calls so that some md.c-
specific stuff can move out of smgropen(), and future
implementations can also run their own initialization code.
Author: Thomas Munro
Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
---
contrib/bloom/blinsert.c | 2 +-
contrib/pg_buffercache/Makefile | 4 +-
...cache--1.2.sql => pg_buffercache--1.4.sql} | 8 +--
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 49 ++++++++++---------
doc/src/sgml/pgbuffercache.sgml | 7 +++
src/backend/access/brin/brin_xlog.c | 2 +-
src/backend/access/gin/ginxlog.c | 3 +-
src/backend/access/gist/gistxlog.c | 6 +--
src/backend/access/hash/hash_xlog.c | 12 ++---
src/backend/access/hash/hashpage.c | 6 ++-
src/backend/access/heap/heapam.c | 20 ++++----
src/backend/access/heap/heapam_handler.c | 2 +-
src/backend/access/heap/rewriteheap.c | 6 ++-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 3 +-
src/backend/access/nbtree/nbtxlog.c | 8 +--
src/backend/access/spgist/spginsert.c | 6 +--
src/backend/access/spgist/spgxlog.c | 12 ++---
src/backend/access/transam/xlog.c | 6 ++-
src/backend/access/transam/xloginsert.c | 39 +++++++++------
src/backend/access/transam/xlogreader.c | 7 +++
src/backend/access/transam/xlogutils.c | 17 ++++---
src/backend/catalog/storage.c | 11 +++--
src/backend/commands/tablecmds.c | 2 +-
src/backend/replication/logical/decode.c | 10 ++--
.../replication/logical/reorderbuffer.c | 3 +-
src/backend/storage/buffer/bufmgr.c | 29 +++++++----
src/backend/storage/buffer/localbuf.c | 8 +--
src/backend/storage/freespace/freespace.c | 3 +-
src/backend/storage/freespace/fsmpage.c | 3 +-
src/backend/storage/smgr/md.c | 29 +++++++----
src/backend/storage/smgr/smgr.c | 13 +++--
src/bin/pg_rewind/parsexlog.c | 8 ++-
src/bin/pg_waldump/pg_waldump.c | 16 ++++--
src/include/access/xloginsert.h | 12 +++--
src/include/access/xlogreader.h | 6 ++-
src/include/access/xlogutils.h | 6 ++-
src/include/storage/buf_internals.h | 11 ++++-
src/include/storage/bufmgr.h | 5 +-
src/include/storage/md.h | 1 +
src/include/storage/smgr.h | 8 ++-
src/include/utils/rel.h | 6 ++-
43 files changed, 260 insertions(+), 159 deletions(-)
rename contrib/pg_buffercache/{pg_buffercache--1.2.sql => pg_buffercache--1.4.sql} (70%)
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 4b2186b8dda..e39d21df1f6 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -181,7 +181,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 18f7a874524..d76ac243d3e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -4,7 +4,9 @@ MODULE_big = pg_buffercache
OBJS = pg_buffercache_pages.o $(WIN32RES)
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
+DATA = \
+ pg_buffercache--1.4.sql \
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
pg_buffercache--unpackaged--1.0.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.4.sql
similarity index 70%
rename from contrib/pg_buffercache/pg_buffercache--1.2.sql
rename to contrib/pg_buffercache/pg_buffercache--1.4.sql
index 6ee5d8435bd..4a36b4275da 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.4.sql
@@ -1,4 +1,4 @@
-/* contrib/pg_buffercache/pg_buffercache--1.2.sql */
+/* contrib/pg_buffercache/pg_buffercache--1.4.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit
@@ -12,9 +12,9 @@ LANGUAGE C PARALLEL SAFE;
-- Create a view for convenient access.
CREATE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
- relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
- pinning_backends int4);
+ (bufferid integer, smgrid int2, relfilenode oid, reltablespace oid,
+ reldatabase oid, relforknumber int2, relblocknumber int8, isdirty bool,
+ usagecount int2, pinning_backends int4);
-- Don't want these to be available to public.
REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae9abf..a82ae5f9bb5 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579fcbb0..052764e46c6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -15,8 +15,8 @@
#include "storage/bufmgr.h"
-#define NUM_BUFFERCACHE_PAGES_MIN_ELEM 8
-#define NUM_BUFFERCACHE_PAGES_ELEM 9
+#define NUM_BUFFERCACHE_PAGES_MIN_ELEM 10
+#define NUM_BUFFERCACHE_PAGES_ELEM 10
PG_MODULE_MAGIC;
@@ -25,6 +25,7 @@ PG_MODULE_MAGIC;
*/
typedef struct
{
+ SmgrId smgrid;
uint32 bufferid;
Oid relfilenode;
Oid reltablespace;
@@ -102,24 +103,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
tupledesc = CreateTemplateTupleDesc(expected_tupledesc->natts);
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 2, "smgrid",
+ INT2OID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 3, "relfilenode",
OIDOID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reltablespace",
OIDOID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 5, "reldatabase",
OIDOID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 5, "relforknumber",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 6, "relforknumber",
INT2OID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 6, "relblocknumber",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 7, "relblocknumber",
INT8OID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 7, "isdirty",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 8, "isdirty",
BOOLOID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 8, "usage_count",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 9, "usage_count",
INT2OID, -1, 0);
-
- if (expected_tupledesc->natts == NUM_BUFFERCACHE_PAGES_ELEM)
- TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
- INT4OID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 10, "pinning_backends",
+ INT4OID, -1, 0);
fctx->tupdesc = BlessTupleDesc(tupledesc);
@@ -153,6 +154,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
+ fctx->record[i].smgrid = bufHdr->tag.smgrid;
fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
@@ -204,28 +206,29 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
nulls[5] = true;
nulls[6] = true;
nulls[7] = true;
- /* unused for v1.0 callers, but the array is always long enough */
nulls[8] = true;
+ nulls[9] = true;
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = Int16GetDatum(fctx->record[i].smgrid);
nulls[1] = false;
- values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
+ values[2] = ObjectIdGetDatum(fctx->record[i].relfilenode);
nulls[2] = false;
- values[3] = ObjectIdGetDatum(fctx->record[i].reldatabase);
+ values[3] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[3] = false;
- values[4] = ObjectIdGetDatum(fctx->record[i].forknum);
+ values[4] = ObjectIdGetDatum(fctx->record[i].reldatabase);
nulls[4] = false;
- values[5] = Int64GetDatum((int64) fctx->record[i].blocknum);
+ values[5] = ObjectIdGetDatum(fctx->record[i].forknum);
nulls[5] = false;
- values[6] = BoolGetDatum(fctx->record[i].isdirty);
+ values[6] = Int64GetDatum((int64) fctx->record[i].blocknum);
nulls[6] = false;
- values[7] = Int16GetDatum(fctx->record[i].usagecount);
+ values[7] = BoolGetDatum(fctx->record[i].isdirty);
nulls[7] = false;
- /* unused for v1.0 callers, but the array is always long enough */
- values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
+ values[8] = Int16GetDatum(fctx->record[i].usagecount);
nulls[8] = false;
+ values[9] = Int32GetDatum(fctx->record[i].pinning_backends);
+ nulls[9] = false;
}
/* Build and return the tuple. */
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index faf5a3115dc..a0a7be32b4b 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -57,6 +57,13 @@
<entry>ID, in the range 1..<varname>shared_buffers</varname></entry>
</row>
+ <row>
+ <entry><structfield>smgrid</structfield></entry>
+ <entry><type>smallint</type></entry>
+ <entry></entry>
+ <entry>Block storage manager ID. 0 for regular relation data.</entry>
+ </row>
+
<row>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index db1f47ca218..a13b3cd2575 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -217,7 +217,7 @@ brin_xlog_revmap_extend(XLogReaderState *record)
xlrec = (xl_brin_revmap_extend *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &targetBlk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &targetBlk);
Assert(xlrec->targetBlk == targetBlk);
/* Update the metapage */
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index b648af1ff65..3c54dc6f369 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,11 +95,12 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
+ SmgrId smgrid;
RelFileNode node;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &smgrid, &node, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
node.spcNode, node.dbNode, node.relNode);
}
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index 503db34d863..bf945b9fb50 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -193,7 +193,7 @@ gistRedoDeleteRecord(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
@@ -278,7 +278,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
BlockNumber blkno;
IndexTuple *tuples;
- XLogRecGetBlockTag(record, i + 1, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, i + 1, NULL, NULL, NULL, &blkno);
if (blkno == GIST_ROOT_BLKNO)
{
Assert(i == 0);
@@ -313,7 +313,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
{
BlockNumber nextblkno;
- XLogRecGetBlockTag(record, i + 2, NULL, NULL, &nextblkno);
+ XLogRecGetBlockTag(record, i + 2, NULL, NULL, NULL, &nextblkno);
GistPageGetOpaque(page)->rightlink = nextblkno;
}
else
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d7b70981101..ec604a7d428 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -51,7 +51,7 @@ hash_xlog_init_meta_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
@@ -89,7 +89,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(bitmapbuf);
UnlockReleaseBuffer(bitmapbuf);
@@ -113,7 +113,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
PageSetLSN(page, lsn);
MarkBufferDirty(metabuf);
- XLogRecGetBlockTag(record, 1, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
}
@@ -190,8 +190,8 @@ hash_xlog_add_ovfl_page(XLogReaderState *record)
Size datalen PG_USED_FOR_ASSERTS_ONLY;
bool new_bmpage = false;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &rightblk);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &leftblk);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &rightblk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &leftblk);
ovflbuf = XLogInitBufferForRedo(record, 0);
Assert(BufferIsValid(ovflbuf));
@@ -1001,7 +1001,7 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index b7adfdb826e..f36ef902c56 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -427,7 +427,8 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1021,7 +1022,8 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 6f26ddac5f9..2385dae539b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7683,7 +7683,7 @@ heap_xlog_clean(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
@@ -7778,7 +7778,7 @@ heap_xlog_visible(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, &rnode, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -7926,7 +7926,7 @@ heap_xlog_freeze_page(XLogReaderState *record)
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
}
@@ -7998,7 +7998,7 @@ heap_xlog_delete(XLogReaderState *record)
RelFileNode target_node;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8079,7 +8079,7 @@ heap_xlog_insert(XLogReaderState *record)
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8201,7 +8201,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* The visibility map may need to be fixed even if the heap page is
@@ -8347,8 +8347,8 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
- if (XLogRecGetBlockTag(record, 1, NULL, NULL, &oldblk))
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &newblk);
+ if (XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &oldblk))
{
/* HOT updates are never done across pages */
Assert(!hot_update);
@@ -8643,7 +8643,7 @@ heap_xlog_lock(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
@@ -8716,7 +8716,7 @@ heap_xlog_lock_updated(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 4d179881f27..d7e5bdb701d 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -626,7 +626,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, *newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index bce4274362c..a06128ca58e 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -331,7 +331,8 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -696,7 +697,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
page,
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 02fb352b94a..16becdaf85e 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -172,7 +172,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 0b5be776d63..9051b0fcdca 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -658,7 +658,8 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the heap NEWPAGE record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(SMGR_MD, &wstate->index->rd_node, MAIN_FORKNUM, blkno,
+ page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 0a85d8b535a..f257245dfa0 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -216,9 +216,9 @@ btree_xlog_split(bool onleft, XLogReaderState *record)
BlockNumber rightsib;
BlockNumber rnext;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &leftsib);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &rightsib);
- if (!XLogRecGetBlockTag(record, 2, NULL, NULL, &rnext))
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &leftsib);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &rightsib);
+ if (!XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &rnext))
rnext = P_NONE;
/*
@@ -524,7 +524,7 @@ btree_xlog_delete(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd440cf0..8019f6839de 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index ebe6ae8715b..3ce35feee69 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -151,7 +151,7 @@ spgRedoAddLeaf(XLogReaderState *record)
SpGistInnerTuple tuple;
BlockNumber blknoLeaf;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blknoLeaf);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blknoLeaf);
page = BufferGetPage(buffer);
@@ -184,7 +184,7 @@ spgRedoMoveLeafs(XLogReaderState *record)
XLogRedoAction action;
BlockNumber blknoDst;
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoDst);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoDst);
fillFakeState(&state, xldata->stateSrc);
@@ -328,8 +328,8 @@ spgRedoAddNode(XLogReaderState *record)
BlockNumber blkno;
BlockNumber blknoNew;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blkno);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoNew);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoNew);
/*
* In normal operation we would have all three pages (source, dest,
@@ -549,7 +549,7 @@ spgRedoPickSplit(XLogReaderState *record)
BlockNumber blknoInner;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 2, NULL, NULL, &blknoInner);
+ XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &blknoInner);
fillFakeState(&state, xldata->stateSrc);
@@ -879,7 +879,7 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
RelFileNode node;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &node, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
node);
}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c00b63c751c..77f405b79a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1373,6 +1373,7 @@ checkXLogConsistency(XLogReaderState *record)
ForkNumber forknum;
BlockNumber blkno;
int block_id;
+ SmgrId smgrid;
/* Records with no backup blocks have no need for consistency checks. */
if (!XLogRecHasAnyBlockRefs(record))
@@ -1385,7 +1386,8 @@ checkXLogConsistency(XLogReaderState *record)
Buffer buf;
Page page;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -1410,7 +1412,7 @@ checkXLogConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
RBM_NORMAL_NO_LOG);
if (!BufferIsValid(buf))
continue;
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 1c76dcfa0dc..01a48054e8e 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -43,7 +43,8 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ SmgrId smgrid; /* identifies the SMGR, relation and block */
+ RelFileNode rnode;
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->smgrid, &regbuf->rnode, &regbuf->forkno,
+ &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -248,7 +250,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -263,8 +266,9 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
- BlockNumber blknum, Page page, uint8 flags)
+XLogRegisterBlock(uint8 block_id, SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forknum, BlockNumber blknum, Page page,
+ uint8 flags)
{
registered_buffer *regbuf;
@@ -280,6 +284,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
+ regbuf->smgrid = smgrid;
regbuf->rnode = *rnode;
regbuf->forkno = forknum;
regbuf->block = blknum;
@@ -303,7 +308,8 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -702,7 +708,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && regbuf->smgrid == prev_regbuf->smgrid &&
+ RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -727,6 +734,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
+ memcpy(scratch, &regbuf->smgrid, sizeof(SmgrId));
+ scratch += sizeof(SmgrId);
memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
scratch += sizeof(RelFileNode);
}
@@ -919,6 +928,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -947,8 +957,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkno, &blkno);
+ XLogRegisterBlock(0, smgrid, &rnode, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -969,8 +979,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
- Page page, bool page_std)
+log_newpage(SmgrId smgrid, RelFileNode *rnode, ForkNumber forkNum,
+ BlockNumber blkno, Page page, bool page_std)
{
int flags;
XLogRecPtr recptr;
@@ -980,7 +990,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, smgrid, rnode, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1009,6 +1019,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1016,9 +1027,9 @@ log_newpage_buffer(Buffer buffer, bool page_std)
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(smgrid, &rnode, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 9196aa3aaef..de2bdd9d61a 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1056,6 +1056,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
uint32 remaining;
uint32 datatotal;
RelFileNode *rnode = NULL;
+ SmgrId smgrid = SMGR_INVALID;
uint8 block_id;
ResetDecoder(state);
@@ -1229,8 +1230,10 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
+ COPY_HEADER_FIELD(&blk->smgrid, sizeof(SmgrId));
COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
rnode = &blk->rnode;
+ smgrid = blk->smgrid;
}
else
{
@@ -1242,6 +1245,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
goto err;
}
+ blk->smgrid = smgrid;
blk->rnode = *rnode;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
@@ -1355,6 +1359,7 @@ err:
*/
bool
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
+ SmgrId *smgrid,
RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
{
DecodedBkpBlock *bkpb;
@@ -1363,6 +1368,8 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->blocks[block_id];
+ if (smgrid)
+ *smgrid = bkpb->smgrid;
if (rnode)
*rnode = bkpb->rnode;
if (forknum)
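Reviewer note on the xloginsert.c and xlogreader.c hunks above: the SMGR ID travels with the RelFileNode in each block header, and both are omitted when the "same relation" flag is set. The following is a minimal standalone sketch of that round trip, not the patch's code. All `demo_`/`Demo` names, the flag value, and the simplified header layout (one flag byte, optional SMGR ID plus relation, block number) are assumptions made for illustration; the real record format carries more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SAME_REL 0x80		/* hypothetical stand-in for BKPBLOCK_SAME_REL */

typedef struct DemoRelFileNode
{
	uint32_t	spcNode, dbNode, relNode;
} DemoRelFileNode;

typedef struct DemoBlockRef
{
	int32_t		smgrid;
	DemoRelFileNode rnode;
	uint32_t	blkno;
} DemoBlockRef;

/* Serialize block references, eliding (smgrid, rnode) when both repeat. */
size_t
demo_encode(const DemoBlockRef *refs, int nrefs, char *out)
{
	char	   *p = out;

	for (int i = 0; i < nrefs; i++)
	{
		int			samerel = i > 0 &&
			refs[i].smgrid == refs[i - 1].smgrid &&
			memcmp(&refs[i].rnode, &refs[i - 1].rnode,
				   sizeof(DemoRelFileNode)) == 0;

		*p++ = (char) (samerel ? DEMO_SAME_REL : 0);
		if (!samerel)
		{
			memcpy(p, &refs[i].smgrid, sizeof(int32_t));
			p += sizeof(int32_t);
			memcpy(p, &refs[i].rnode, sizeof(DemoRelFileNode));
			p += sizeof(DemoRelFileNode);
		}
		memcpy(p, &refs[i].blkno, sizeof(uint32_t));
		p += sizeof(uint32_t);
	}
	return (size_t) (p - out);
}

/* Parse the headers back; returns the number of references, or -1 on error. */
int
demo_decode(const char *in, size_t len, DemoBlockRef *refs, int maxrefs)
{
	const char *p = in;
	const char *end = in + len;
	int			n = 0;
	int			have_prev = 0;
	int32_t		smgrid = -1;
	DemoRelFileNode rnode = {0, 0, 0};

	while (p < end)
	{
		uint8_t		flags = (uint8_t) *p++;
		size_t		need = sizeof(uint32_t);

		if (!(flags & DEMO_SAME_REL))
			need += sizeof(int32_t) + sizeof(DemoRelFileNode);
		if (n >= maxrefs || (size_t) (end - p) < need)
			return -1;

		if (!(flags & DEMO_SAME_REL))
		{
			memcpy(&smgrid, p, sizeof(int32_t));
			p += sizeof(int32_t);
			memcpy(&rnode, p, sizeof(DemoRelFileNode));
			p += sizeof(DemoRelFileNode);
			have_prev = 1;
		}
		else if (!have_prev)
			return -1;			/* SAME_REL set but nothing to inherit from */

		/* The SMGR ID is carried forward together with the relation. */
		refs[n].smgrid = smgrid;
		refs[n].rnode = rnode;
		memcpy(&refs[n].blkno, p, sizeof(uint32_t));
		p += sizeof(uint32_t);
		n++;
	}
	return n;
}
```

The point of the sketch is that "same relation" has to mean "same SMGR ID and same RelFileNode" on both sides; if only one side compared the SMGR ID, a record touching two storage managers with identical relation numbers would decode incorrectly.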
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 10a663bae62..c5f27fb0e17 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -335,8 +335,9 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Page page;
bool zeromode;
bool willinit;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum, &blkno))
{
/* Caller specified a bogus block_id */
elog(PANIC, "failed to locate backup block with ID %d", block_id);
@@ -357,7 +358,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK);
page = BufferGetPage(*buf);
if (!RestoreBlockImage(record, block_id, page))
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode);
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno, mode);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -434,7 +435,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode)
{
BlockNumber lastblock;
@@ -444,7 +445,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
Assert(blkno != P_NEW);
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(smgrid, rnode, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -461,7 +462,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
else
@@ -486,7 +487,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum,
P_NEW, mode, NULL);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -496,7 +497,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index fb41f223ada..d356594ee96 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -102,7 +102,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(SMGR_MD, rnode, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
@@ -353,7 +353,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(SMGR_MD, &dst->smgr_rnode.node, forkNum, blkno, page,
+ false);
PageSetChecksumInplace(page, blkno);
@@ -428,7 +429,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(SMGR_MD, pending->relnode, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -580,7 +581,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -589,7 +590,7 @@ smgr_redo(XLogReaderState *record)
SMgrRelation reln;
Relation rel;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8e4743d1101..30a0c709548 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12533,7 +12533,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index eec3a228429..24b6441a98d 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -683,7 +683,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -731,7 +731,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -796,7 +796,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -892,7 +892,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &rnode, NULL, NULL);
if (rnode.dbNode != ctx->slot->data.database)
return;
@@ -991,7 +991,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
RelFileNode target_node;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 2cfdf1c9ac9..4a946fd5534 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3496,6 +3496,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
ReorderBufferTupleCidEnt *ent;
ForkNumber forkno;
BlockNumber blockno;
+ SmgrId smgrid;
bool updated_mapping = false;
/* be careful about padding */
@@ -3507,7 +3508,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &smgrid, &key.relnode, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 887023fc8a5..1981d563b59 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -554,7 +554,8 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_which,
+ reln->rd_smgr->smgr_rnode.node,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -679,13 +680,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* parameters.
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(smgrid, rnode, InvalidBackendId);
Assert(InRecovery);
@@ -1008,7 +1009,8 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1842,6 +1844,7 @@ BufferSync(int flags)
buf_state |= BM_CHECKPOINT_NEEDED;
item = &CkptBufferIds[num_to_scan++];
+ item->smgrid = bufHdr->tag.smgrid;
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
item->relNode = bufHdr->tag.rnode.relNode;
@@ -2625,12 +2628,12 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
- * a buffer.
+ * Returns the SMGR ID, relfilenode, fork number and block number
+ * associated with a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum)
+BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2643,6 +2646,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
+ *smgrid = bufHdr->tag.smgrid;
*rnode = bufHdr->tag.rnode;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
@@ -2694,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.smgrid, buf->tag.rnode, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -4183,6 +4187,11 @@ ckpt_buforder_comparator(const void *pa, const void *pb)
const CkptSortItem *a = (const CkptSortItem *) pa;
const CkptSortItem *b = (const CkptSortItem *) pb;
+ /* compare smgr */
+ if (a->smgrid < b->smgrid)
+ return -1;
+ else if (a->smgrid > b->smgrid)
+ return 1;
/* compare tablespace */
if (a->tsId < b->tsId)
return -1;
@@ -4340,7 +4349,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.smgrid, tag.rnode, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
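Reviewer note on the ckpt_buforder_comparator hunk above: with the SMGR ID compared first, checkpoint writes are grouped by storage manager before tablespace. A small standalone sketch of that ordering follows. The `Demo` struct and function names are hypothetical, and the field order after the SMGR ID (tablespace, relation, fork, block) is an assumption based on the fields visible in CkptSortItem.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct DemoSortItem
{
	int			smgrid;
	uint32_t	tsId;
	uint32_t	relNode;
	int			forkNum;
	uint32_t	blockNum;
} DemoSortItem;

/* Three-way compare that avoids overflow from subtraction. */
#define DEMO_CMP(x, y) (((x) > (y)) - ((x) < (y)))

/* Order by storage manager first, then by the remaining tag fields. */
int
demo_ckpt_comparator(const void *pa, const void *pb)
{
	const DemoSortItem *a = (const DemoSortItem *) pa;
	const DemoSortItem *b = (const DemoSortItem *) pb;
	int			c;

	if ((c = DEMO_CMP(a->smgrid, b->smgrid)) != 0)
		return c;
	if ((c = DEMO_CMP(a->tsId, b->tsId)) != 0)
		return c;
	if ((c = DEMO_CMP(a->relNode, b->relNode)) != 0)
		return c;
	if ((c = DEMO_CMP(a->forkNum, b->forkNum)) != 0)
		return c;
	return DEMO_CMP(a->blockNum, b->blockNum);
}

void
demo_sort_for_checkpoint(DemoSortItem *items, size_t n)
{
	qsort(items, n, sizeof(DemoSortItem), demo_ckpt_comparator);
}
```

One consequence worth discussing: buffers from different storage managers that live in the same tablespace are no longer adjacent in the sorted order, which may or may not matter for per-tablespace write balancing.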
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index c462ea82a92..896285a1d9f 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,8 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +112,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +211,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.smgrid, bufHdr->tag.rnode, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index eee82860575..b16810162e4 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -210,7 +210,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR);
+ buf = XLogReadBufferExtended(SMGR_MD, rnode, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f12dd..da3b286ca6c 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,11 +268,12 @@ restart:
*
* Fix the corruption and restart.
*/
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &smgrid, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 52ca6eeb28f..622315fbd16 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -120,7 +120,7 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
bool isRedo);
-static MdfdVec *mdopen(SMgrRelation reln, ForkNumber forknum, int behavior);
+static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
@@ -151,6 +151,17 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+/*
+ * mdopen() -- Initialize a newly-opened relation.
+ */
+void
+mdopen(SMgrRelation reln)
+{
+ /* mark it not open */
+ for (int forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+ reln->md_num_open_segs[forknum] = 0;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -165,7 +176,7 @@ mdexists(SMgrRelation reln, ForkNumber forkNum)
*/
mdclose(reln, forkNum);
- return (mdopen(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
+ return (mdopenfork(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
}
/*
@@ -425,7 +436,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
}
/*
- * mdopen() -- Open the specified relation.
+ * mdopenfork() -- Open one fork of the specified relation.
*
* Note we only open the first segment, when there are multiple segments.
*
@@ -435,7 +446,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* invent one out of whole cloth.
*/
static MdfdVec *
-mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
+mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
{
MdfdVec *mdfd;
char *path;
@@ -713,11 +724,11 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL);
+ MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
- /* mdopen has opened the first segment */
+ /* mdopenfork has opened the first segment */
Assert(reln->md_num_open_segs[forknum] > 0);
/*
@@ -981,7 +992,7 @@ DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
srels = palloc(sizeof(SMgrRelation) * ndelrels);
for (i = 0; i < ndelrels; i++)
{
- SMgrRelation srel = smgropen(delrels[i], InvalidBackendId);
+ SMgrRelation srel = smgropen(SMGR_MD, delrels[i], InvalidBackendId);
if (isRedo)
{
@@ -1137,7 +1148,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
v = &reln->md_seg_fds[forknum][reln->md_num_open_segs[forknum] - 1];
else
{
- v = mdopen(reln, forknum, behavior);
+ v = mdopenfork(reln, forknum, behavior);
if (!v)
return NULL; /* if behavior & EXTENSION_RETURN_NULL */
}
@@ -1257,7 +1268,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(SMGR_MD, ftag->rnode, InvalidBackendId);
MdfdVec *v;
char *p;
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 8191118b619..ef3907f2a79 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -41,6 +41,7 @@ typedef struct f_smgr
{
void (*smgr_init) (void); /* may be NULL */
void (*smgr_shutdown) (void); /* may be NULL */
+ void (*smgr_open) (SMgrRelation reln);
void (*smgr_close) (SMgrRelation reln, ForkNumber forknum);
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
@@ -68,6 +69,7 @@ static const f_smgr smgrsw[] = {
{
.smgr_init = mdinit,
.smgr_shutdown = NULL,
+ .smgr_open = mdopen,
.smgr_close = mdclose,
.smgr_create = mdcreate,
.smgr_exists = mdexists,
@@ -141,7 +143,7 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(SmgrId smgrid, RelFileNode rnode, BackendId backend)
{
RelFileNodeBackend brnode;
SMgrRelation reln;
@@ -170,18 +172,15 @@ smgropen(RelFileNode rnode, BackendId backend)
/* Initialize it if not present before */
if (!found)
{
- int forknum;
-
/* hash_search already filled in the lookup key */
reln->smgr_owner = NULL;
reln->smgr_targblock = InvalidBlockNumber;
reln->smgr_fsm_nblocks = InvalidBlockNumber;
reln->smgr_vm_nblocks = InvalidBlockNumber;
- reln->smgr_which = 0; /* we only have md.c at present */
+ reln->smgr_which = smgrid;
- /* mark it not open */
- for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- reln->md_num_open_segs[forknum] = 0;
+ /* implementation-specific initialization */
+ smgrsw[reln->smgr_which].smgr_open(reln);
/* it has no owner yet */
dlist_push_tail(&unowned_relns, &reln->node);
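Reviewer note on the smgr.c hunks above: smgropen() now records the caller's SMGR ID in smgr_which and hands implementation-specific setup to a new smgr_open callback in the function table. A minimal standalone sketch of that dispatch follows. Every `demo_`/`Demo` name is hypothetical, the second storage manager is invented purely to show a second table entry, and the range check is an addition of this sketch that the patch as posted does not perform.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FORKNUM 3

typedef enum DemoSmgrId
{
	DEMO_SMGR_INVALID = -1,
	DEMO_SMGR_MD = 0,
	DEMO_SMGR_OTHER = 1			/* hypothetical second storage manager */
} DemoSmgrId;

typedef struct DemoSMgrRelation
{
	int			smgr_which;
	int			md_num_open_segs[DEMO_MAX_FORKNUM + 1];
	int			other_ready;	/* state owned by the hypothetical manager */
} DemoSMgrRelation;

typedef struct demo_f_smgr
{
	void		(*smgr_open) (DemoSMgrRelation *reln);
} demo_f_smgr;

/* md-style initialization: mark every fork as not open. */
static void
demo_mdopen(DemoSMgrRelation *reln)
{
	for (int forknum = 0; forknum <= DEMO_MAX_FORKNUM; forknum++)
		reln->md_num_open_segs[forknum] = 0;
}

static void
demo_otheropen(DemoSMgrRelation *reln)
{
	reln->other_ready = 1;
}

/* Function table indexed by SMGR ID, in the style of smgrsw[]. */
static const demo_f_smgr demo_smgrsw[] = {
	[DEMO_SMGR_MD] = {.smgr_open = demo_mdopen},
	[DEMO_SMGR_OTHER] = {.smgr_open = demo_otheropen},
};

/* Returns 0 on success, -1 if the ID does not name a table entry. */
int
demo_smgropen(DemoSMgrRelation *reln, DemoSmgrId smgrid)
{
	int			which = (int) smgrid;
	int			nsmgr = (int) (sizeof(demo_smgrsw) / sizeof(demo_smgrsw[0]));

	if (which < 0 || which >= nsmgr)
		return -1;

	reln->smgr_which = which;
	/* implementation-specific initialization */
	demo_smgrsw[which].smgr_open(reln);
	return 0;
}
```

Since smgr_which is used directly as an array index, an Assert on the range at the top of smgropen() might be worth adding so that an SMGR_INVALID tag fails loudly rather than reading outside the table.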
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 7709b96e008..7f0a179aaac 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -395,8 +395,14 @@ extractPageInfo(XLogReaderState *record)
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
+ continue;
+
+ /* TODO: How should we handle other smgr IDs? */
+ if (smgrid != SMGR_MD)
continue;
/* We only care about the main fork; others are copied in toto */
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f61505ade36..c3bf2b1b7f5 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -524,6 +524,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
const RmgrDescData *desc = &RmgrDescTable[XLogRecGetRmid(record)];
uint32 rec_len;
uint32 fpi_len;
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blk;
@@ -556,16 +557,19 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%u: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
else
- printf(", blkref #%u: rel %u/%u/%u blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
if (XLogRecHasBlockImage(record, block_id))
@@ -587,9 +591,11 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%u: rel %u/%u/%u fork %s blk %u",
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
+ printf("\tblkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 30c4ff7bea1..5e3388c2156 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -16,6 +16,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
/*
@@ -45,15 +46,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int nbuffers, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
- ForkNumber forknum, BlockNumber blknum, char *page,
- uint8 flags);
+extern void XLogRegisterBlock(uint8 block_id, SmgrId smgrid,
+ RelFileNode *rnode, ForkNumber forknum, BlockNumber blknum,
+ char *page, uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
- BlockNumber blk, char *page, bool page_std);
+extern XLogRecPtr log_newpage(SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forkNum, BlockNumber blk,
+ char *page, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
BlockNumber startblk, BlockNumber endblk, bool page_std);
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index f3bae0bf492..70450d72d9e 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -26,6 +26,7 @@
#define XLOGREADER_H
#include "access/xlogrecord.h"
+#include "storage/smgr.h"
typedef struct XLogReaderState XLogReaderState;
@@ -43,6 +44,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
@@ -243,7 +245,7 @@ extern bool DecodeXLogRecord(XLogReaderState *state, XLogRecord *record,
extern bool RestoreBlockImage(XLogReaderState *recoder, uint8 block_id, char *dst);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern bool XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum);
+ SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum);
#endif /* XLOGREADER_H */
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index 0ab5ba62f51..4afb4cb79ae 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -13,6 +13,7 @@
#include "access/xlogreader.h"
#include "storage/bufmgr.h"
+#include "storage/smgr.h"
extern bool XLogHaveInvalidPages(void);
@@ -41,8 +42,9 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
- BlockNumber blkno, ReadBufferMode mode);
+extern Buffer XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode,
+ ForkNumber forknum,
+ BlockNumber blkno, ReadBufferMode mode);
extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
extern void FreeFakeRelcacheEntry(Relation fakerel);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ba1b5463fc3..bcf10d0eb55 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -87,16 +87,20 @@
*
* Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
* to be fixed to zero them, since this struct is used as a hash key.
+ * Conceptually the SmgrId should go first, but we put it next to the
+ * ForkNumber so that it packs better with typical alignment rules.
*/
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
+ int16 smgrid; /* SmgrId */
+ int16 forkNum; /* ForkNumber */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
+ (a).smgrid = SMGR_INVALID, \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
(a).rnode.relNode = InvalidOid, \
@@ -104,8 +108,9 @@ typedef struct buftag
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_smgrid,xx_rnode,xx_forkNum,xx_blockNum) \
( \
+ (a).smgrid = (xx_smgrid), \
(a).rnode = (xx_rnode), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
@@ -113,6 +118,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
+ (a).smgrid == (b).smgrid && \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
@@ -288,6 +294,7 @@ extern BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
+ SmgrId smgrid;
Oid tsId;
Oid relNode;
ForkNumber forkNum;
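Reviewer note on the BufferTag hunk above: the claim that the tag does not grow, and that it still has no pad bytes (which matters because it is a hash key), can be checked with a small standalone program. The sketch below assumes 32-bit Oid and BlockNumber and a 4-byte enum for the old ForkNumber field; the `Demo` types are stand-ins written for this note, not the PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoOid;			/* assumed width of Oid */
typedef uint32_t DemoBlockNumber;	/* assumed width of BlockNumber */

typedef struct DemoRelFileNode
{
	DemoOid		spcNode;
	DemoOid		dbNode;
	DemoOid		relNode;
} DemoRelFileNode;

/* Layout before the patch; the ForkNumber enum is modeled as int32_t. */
typedef struct DemoOldBufferTag
{
	DemoRelFileNode rnode;
	int32_t		forkNum;
	DemoBlockNumber blockNum;
} DemoOldBufferTag;

/* Layout after the patch: two 16-bit fields share the old fork slot. */
typedef struct DemoNewBufferTag
{
	DemoRelFileNode rnode;
	int16_t		smgrid;
	int16_t		forkNum;
	DemoBlockNumber blockNum;
} DemoNewBufferTag;

/* Nonzero if adding the SMGR ID left the tag size unchanged. */
int
demo_tag_size_unchanged(void)
{
	return sizeof(DemoNewBufferTag) == sizeof(DemoOldBufferTag);
}

/* Nonzero if the new tag is exactly the sum of its members (no padding). */
int
demo_new_tag_has_no_padding(void)
{
	return sizeof(DemoNewBufferTag) ==
		sizeof(DemoRelFileNode) + 2 * sizeof(int16_t) + sizeof(DemoBlockNumber);
}
```

A compile-time check along these lines (for example a static assertion on the tag size) could guard against a future field reintroducing padding, since INIT_BUFFERTAG does not zero the struct.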
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index c5826f691de..06e814bbad9 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -18,6 +18,7 @@
#include "storage/buf.h"
#include "storage/bufpage.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -168,7 +169,7 @@ extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy);
extern void ReleaseBuffer(Buffer buffer);
@@ -205,7 +206,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index a6758a10dcb..ae0cce09bad 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -21,6 +21,7 @@
/* md storage manager functionality */
extern void mdinit(void);
+extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 770193e285e..b9d9ca163a0 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -79,8 +79,14 @@ typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+typedef enum SmgrId
+{
+ SMGR_INVALID = -1,
+ SMGR_MD = 0, /* md.c */
+} SmgrId;
+
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(SmgrId which, RelFileNode rnode, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d7f33abce3f..c6e516d6ecd 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -52,6 +52,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
+ SmgrId rd_smgrid; /* relation storage manager */
RelFileNode rd_node; /* relation physical identifier */
/* use "struct" here to avoid needing to include smgr.h: */
struct SMgrRelationData *rd_smgr; /* cached file handle, or NULL */
@@ -471,7 +472,10 @@ typedef struct ViewOptions
#define RelationOpenSmgr(relation) \
do { \
if ((relation)->rd_smgr == NULL) \
- smgrsetowner(&((relation)->rd_smgr), smgropen((relation)->rd_node, (relation)->rd_backend)); \
+ smgrsetowner(&((relation)->rd_smgr), \
+ smgropen((relation)->rd_smgrid, \
+ (relation)->rd_node, \
+ (relation)->rd_backend)); \
} while (0)
/*
--
2.21.0
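As an aside on the buf_internals.h hunk above: the claim that the new field causes "no net change in tag size" can be checked with a minimal, standalone sketch. The typedefs below are simplified stand-ins rather than the real PostgreSQL headers, `OldBufferTag`, `NewBufferTag` and `new_tags_equal()` are names made up for illustration, and the result assumes a platform where an enum occupies four bytes, which is common but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's types; simplified, not the real headers. */
typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct RelFileNode
{
    Oid         spcNode;
    Oid         dbNode;
    Oid         relNode;
} RelFileNode;

typedef enum ForkNumber
{
    InvalidForkNumber = -1,
    MAIN_FORKNUM = 0,
    FSM_FORKNUM,
    VISIBILITYMAP_FORKNUM,
    INIT_FORKNUM
} ForkNumber;

/* Layout before the patch: the fork number occupies a whole enum. */
typedef struct OldBufferTag
{
    RelFileNode rnode;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} OldBufferTag;

/* Layout after the patch: two 16-bit fields share that space. */
typedef struct NewBufferTag
{
    RelFileNode rnode;
    int16_t     smgrid;     /* SmgrId */
    int16_t     forkNum;    /* ForkNumber */
    BlockNumber blockNum;
} NewBufferTag;

/*
 * Hypothetical helper mirroring BUFFERTAGS_EQUAL: returns 1 only if both
 * tags name the same block under the same storage manager.
 */
static int
new_tags_equal(const NewBufferTag *a, const NewBufferTag *b)
{
    return a->smgrid == b->smgrid &&
        a->rnode.spcNode == b->rnode.spcNode &&
        a->rnode.dbNode == b->rnode.dbNode &&
        a->rnode.relNode == b->rnode.relNode &&
        a->blockNum == b->blockNum &&
        a->forkNum == b->forkNum;
}
```

Two tags that differ only in `smgrid` compare unequal here, which is what gives each kind of buffered data its own namespace without widening the tag.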
On Wed, May 08, 2019 at 06:31:04PM +1200, Thomas Munro wrote:
The questions are: how should buffer tags distinguish different kinds
of buffers, and how should SMGR direct IO traffic to the right place
when it needs to schlepp pages in and out?
In earlier prototype code, I'd been using a special database number
for undo logs. In a recent thread[1], Tom and others didn't like that
idea much, and Shawn mentioned his colleague's idea of stealing unused
bits from the fork number so that there is no net change in tag size,
but we have entirely separate namespaces for each kind of buffered
data.
Here's a patch that does that, and then makes changes in the main
places I have found so far that need to be aware of the new SMGR ID
field.
Looks good to me. Minor nit: update the comment for XLogRecGetBlockTag:
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 9196aa3aae..9ee086f00b 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1349,12 +1353,13 @@ err:
/*
* Returns information about the block that a block reference refers to.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *smgrid, *rnode,
* *forknum, and *blknum are filled in (if not NULL), and returns true.
* Otherwise returns false.
*/
bool
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
+ SmgrId *smgrid,
RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
{
DecodedBkpBlock *bkpb;
--
Shawn Debnath
Amazon Web Services (AWS)
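For readers skimming the thread, the "filled in (if not NULL)" convention in that comment can be sketched as follows. This is a simplified illustration only: `DemoRecord`, `DemoBlockRef`, `DEMO_MAX_BLOCK_ID` and `demo_get_block_tag()` are hypothetical stand-ins, not the real XLogReaderState or XLogRecGetBlockTag(), and the relation identifier is reduced to a single integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_BLOCK_ID 3     /* hypothetical limit for this sketch */

/* Hypothetical stand-in for one decoded block reference. */
typedef struct DemoBlockRef
{
    bool        in_use;
    int16_t     smgrid;
    uint32_t    relNode;
    int         forknum;
    uint32_t    blkno;
} DemoBlockRef;

/* Hypothetical stand-in for a decoded WAL record. */
typedef struct DemoRecord
{
    DemoBlockRef blocks[DEMO_MAX_BLOCK_ID + 1];
} DemoRecord;

/*
 * If the record has a block reference with the given ID, copy each field
 * whose output pointer is non-NULL and return true.  Otherwise return
 * false and leave all outputs untouched.
 */
static bool
demo_get_block_tag(const DemoRecord *record, uint8_t block_id,
                   int16_t *smgrid, uint32_t *relNode,
                   int *forknum, uint32_t *blkno)
{
    const DemoBlockRef *ref;

    if (block_id > DEMO_MAX_BLOCK_ID || !record->blocks[block_id].in_use)
        return false;

    ref = &record->blocks[block_id];
    if (smgrid)
        *smgrid = ref->smgrid;
    if (relNode)
        *relNode = ref->relNode;
    if (forknum)
        *forknum = ref->forknum;
    if (blkno)
        *blkno = ref->blkno;
    return true;
}
```

A caller that only wants the block number passes NULL for the other three outputs, which is why adding the new `smgrid` parameter turns most existing call sites into a mechanical "one more NULL" change.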
On Fri, May 10, 2019 at 8:54 AM Shawn Debnath <sdn@amazon.com> wrote:
On Wed, May 08, 2019 at 06:31:04PM +1200, Thomas Munro wrote:
Looks good to me. Minor nit: update the comment for XLogRecGetBlockTag:
Fixed. Also fixed broken upgrade scripts for pg_buffercache
extension, as pointed out by Robert[1] on the main thread where undo
stuff is being discussed. Attempts to keep subtopics separated have so
far failed, so the thread ostensibly about orphaned file cleanup is
now about undo work allocation, but I figured it'd be useful to
highlight this patch separately as it'll be the first to go in, and
it's needed by your work, Shawn. So I hope we're still on the same
page with this refactoring patch.
One thing I'm not sure about is the TODO message in parsexlog.c's
extractPageInfo() function.
[1]: /messages/by-id/CA+Tgmob4htT-9Tq7eHG3wS=dpKFbQZOyqgSr1iWmV_65Duz6Pw@mail.gmail.com
--
Thomas Munro
https://enterprisedb.com
Attachments:
0002-Move-tablespace-dir-creation-from-smgr.c-to-md.c.patch (application/octet-stream)
From 4d88b4d1fe75234b5568fd1eb1056ac222e1c9c4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 30 Apr 2019 22:11:03 +1200
Subject: [PATCH 02/21] Move tablespace dir creation from smgr.c to md.c.
For undo logs, we don't need to create tablespace directories when
opening a relation, because that is managed automatically by
undolog.c.
Author: Thomas Munro
---
src/backend/storage/smgr/md.c | 14 ++++++++++++++
src/backend/storage/smgr/smgr.c | 14 --------------
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index b2c42cf8f0a..a426e2d36bd 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -196,6 +197,19 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
Assert(reln->md_num_open_segs[forkNum] == 0);
+ /*
+ * We may be using the target table space for the first time in this
+ * database, so create a per-database subdirectory if needed.
+ *
+ * XXX this is a fairly ugly violation of module layering, but this seems
+ * to be the best place to put the check. Maybe TablespaceCreateDbspace
+ * should be here and not in commands/tablespace.c? But that would imply
+ * importing a lot of stuff that smgr.c oughtn't know, either.
+ */
+ TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
+ reln->smgr_rnode.node.dbNode,
+ isRedo);
+
path = relpath(reln->smgr_rnode, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 26281fab51d..4ba07a08f54 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -17,7 +17,6 @@
*/
#include "postgres.h"
-#include "commands/tablespace.h"
#include "lib/ilist.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
@@ -343,19 +342,6 @@ smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
if (isRedo && reln->md_num_open_segs[forknum] > 0)
return;
- /*
- * We may be using the target table space for the first time in this
- * database, so create a per-database subdirectory if needed.
- *
- * XXX this is a fairly ugly violation of module layering, but this seems
- * to be the best place to put the check. Maybe TablespaceCreateDbspace
- * should be here and not in commands/tablespace.c? But that would imply
- * importing a lot of stuff that smgr.c oughtn't know, either.
- */
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- isRedo);
-
smgrsw[reln->smgr_which].smgr_create(reln, forknum, isRedo);
}
--
2.21.0
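As a rough picture of how an SmgrId can steer calls to the right implementation: the sketch below is loosely modeled on the `smgrsw[reln->smgr_which]` dispatch visible in smgrcreate() above, combined with the idea of a per-manager open callback such as mdopen(). Everything in it is an assumption made for illustration: `DemoRelation`, `DemoSmgrOps`, `smgrsw_demo`, `demo_smgropen()` and the second manager `SMGR_DEMO` do not exist in the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef enum SmgrId
{
    SMGR_INVALID = -1,
    SMGR_MD = 0,                /* md.c */
    SMGR_DEMO = 1               /* hypothetical second manager */
} SmgrId;

/* Hypothetical stand-in for SMgrRelationData. */
typedef struct DemoRelation
{
    SmgrId      smgr_which;     /* which manager owns this relation */
    int         open_calls;     /* lets the sketch observe the callbacks */
} DemoRelation;

/* Hypothetical callback table with a single "open" entry. */
typedef struct DemoSmgrOps
{
    void        (*smgr_open) (DemoRelation *reln);
} DemoSmgrOps;

static void
md_open_demo(DemoRelation *reln)
{
    reln->open_calls += 1;      /* md-specific initialization would go here */
}

static void
other_open_demo(DemoRelation *reln)
{
    reln->open_calls += 100;    /* another manager's initialization */
}

/* Indexed by SmgrId, so the order must match the enum. */
static const DemoSmgrOps smgrsw_demo[] = {
    {md_open_demo},
    {other_open_demo}
};

/* Returns 0 on success, -1 if the ID does not name a known manager. */
static int
demo_smgropen(SmgrId which, DemoRelation *reln)
{
    size_t      nmanagers = sizeof(smgrsw_demo) / sizeof(smgrsw_demo[0]);

    if (which < 0 || (size_t) which >= nmanagers)
        return -1;

    reln->smgr_which = which;
    smgrsw_demo[which].smgr_open(reln);
    return 0;
}
```

Keeping manager-specific setup behind the callback means the generic open path only needs to validate the ID and index the table.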
0001-Add-SmgrId-to-smgropen-and-BufferTag.patch (application/octet-stream)
From 4fa9108abaf9e85a99141a61fe9233fc4f548a6d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 8 Mar 2019 12:03:00 +1300
Subject: [PATCH 01/21] Add SmgrId to smgropen() and BufferTag.
To use bufmgr.c for new kinds of data in addition to plain old
relations, add an SMGR argument to places that identify blocks
and the files that hold them (smgropen(), block references in
the WAL, BufferTag).
To avoid making BufferTag wider, take some space away from the
fork number for this new member, since there are just a few
values possible (a suggestion from Anton Shyrabokau).
Add a "smgrid" column to the pg_buffercache extension.
Create a new callback for smgropen() calls so that some md.c-
specific stuff can move out of smgropen(), and future
implementations can also run their own initialization code.
Author: Thomas Munro
Reviewed-by: Robert Haas, Shawn Debnath
Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
---
contrib/bloom/blinsert.c | 2 +-
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.2.sql | 21 ----------
.../pg_buffercache--1.3--1.4.sql | 36 ++++++++++++++++
.../pg_buffercache/pg_buffercache--1.4.sql | 41 +++++++++++++++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 15 +++++--
doc/src/sgml/pgbuffercache.sgml | 7 ++++
src/backend/access/brin/brin_xlog.c | 2 +-
src/backend/access/gin/ginxlog.c | 3 +-
src/backend/access/gist/gistxlog.c | 6 +--
src/backend/access/hash/hash_xlog.c | 12 +++---
src/backend/access/hash/hashpage.c | 6 ++-
src/backend/access/heap/heapam.c | 20 ++++-----
src/backend/access/heap/heapam_handler.c | 2 +-
src/backend/access/heap/rewriteheap.c | 6 ++-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 3 +-
src/backend/access/nbtree/nbtxlog.c | 8 ++--
src/backend/access/spgist/spginsert.c | 6 +--
src/backend/access/spgist/spgxlog.c | 12 +++---
src/backend/access/transam/xlog.c | 6 ++-
src/backend/access/transam/xloginsert.c | 39 +++++++++++-------
src/backend/access/transam/xlogreader.c | 11 ++++-
src/backend/access/transam/xlogutils.c | 17 ++++----
src/backend/catalog/storage.c | 11 ++---
src/backend/commands/tablecmds.c | 2 +-
src/backend/replication/logical/decode.c | 10 ++---
.../replication/logical/reorderbuffer.c | 3 +-
src/backend/storage/buffer/bufmgr.c | 29 ++++++++-----
src/backend/storage/buffer/localbuf.c | 8 ++--
src/backend/storage/freespace/freespace.c | 3 +-
src/backend/storage/freespace/fsmpage.c | 3 +-
src/backend/storage/smgr/md.c | 29 +++++++++----
src/backend/storage/smgr/smgr.c | 13 +++---
src/bin/pg_rewind/parsexlog.c | 8 +++-
src/bin/pg_waldump/pg_waldump.c | 16 +++++---
src/include/access/xloginsert.h | 13 +++---
src/include/access/xlogreader.h | 6 ++-
src/include/access/xlogutils.h | 4 +-
src/include/storage/buf_internals.h | 11 ++++-
src/include/storage/bufmgr.h | 9 ++--
src/include/storage/md.h | 1 +
src/include/storage/smgr.h | 8 +++-
src/include/utils/rel.h | 6 ++-
45 files changed, 323 insertions(+), 159 deletions(-)
delete mode 100644 contrib/pg_buffercache/pg_buffercache--1.2.sql
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4.sql
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 4b2186b8dda..e39d21df1f6 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -181,7 +181,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 18f7a874524..d76ac243d3e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -4,7 +4,9 @@ MODULE_big = pg_buffercache
OBJS = pg_buffercache_pages.o $(WIN32RES)
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
+DATA = \
+ pg_buffercache--1.4.sql \
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
pg_buffercache--unpackaged--1.0.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
deleted file mode 100644
index 6ee5d8435bd..00000000000
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ /dev/null
@@ -1,21 +0,0 @@
-/* contrib/pg_buffercache/pg_buffercache--1.2.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit
-
--- Register the function.
-CREATE FUNCTION pg_buffercache_pages()
-RETURNS SETOF RECORD
-AS 'MODULE_PATHNAME', 'pg_buffercache_pages'
-LANGUAGE C PARALLEL SAFE;
-
--- Create a view for convenient access.
-CREATE VIEW pg_buffercache AS
- SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
- relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
- pinning_backends int4);
-
--- Don't want these to be available to public.
-REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
-REVOKE ALL ON pg_buffercache FROM PUBLIC;
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 00000000000..ab6d20a5ccf
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,36 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+DROP VIEW pg_buffercache;
+
+CREATE VIEW pg_buffercache AS
+ SELECT bufferid,
+ smgrid,
+ relfilenode,
+ reltablespace,
+ reldatabase,
+ relforknumber,
+ relblocknumber,
+ isdirty,
+ usagecount,
+ pinning_backends
+ FROM pg_buffercache_pages() AS P(
+ bufferid integer,
+ relfilenode oid,
+ reltablespace oid,
+ reldatabase oid,
+ relforknumber int2,
+ relblocknumber int8,
+ isdirty bool,
+ usagecount int2,
+ pinning_backends int4,
+ smgrid int2);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.4.sql
new file mode 100644
index 00000000000..9ae167abf0e
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4.sql
@@ -0,0 +1,41 @@
+/* contrib/pg_buffercache/pg_buffercache--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit
+
+-- Register the function.
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages'
+LANGUAGE C PARALLEL SAFE;
+
+-- Create a view for convenient access.
+CREATE VIEW pg_buffercache AS
+ SELECT bufferid,
+ smgrid,
+ relfilenode,
+ reltablespace,
+ reldatabase,
+ relforknumber,
+ relblocknumber,
+ isdirty,
+ usagecount,
+ pinning_backends
+ FROM pg_buffercache_pages() AS P(
+ bufferid integer,
+ relfilenode oid,
+ reltablespace oid,
+ reldatabase oid,
+ relforknumber int2,
+ relblocknumber int8,
+ isdirty bool,
+ usagecount int2,
+ pinning_backends int4,
+ smgrid int2);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae9abf..a82ae5f9bb5 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579fcbb0..2754c1e40e9 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -16,7 +16,7 @@
#define NUM_BUFFERCACHE_PAGES_MIN_ELEM 8
-#define NUM_BUFFERCACHE_PAGES_ELEM 9
+#define NUM_BUFFERCACHE_PAGES_ELEM 10
PG_MODULE_MAGIC;
@@ -25,6 +25,7 @@ PG_MODULE_MAGIC;
*/
typedef struct
{
+ SmgrId smgrid;
uint32 bufferid;
Oid relfilenode;
Oid reltablespace;
@@ -116,10 +117,12 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
BOOLOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 8, "usage_count",
INT2OID, -1, 0);
-
- if (expected_tupledesc->natts == NUM_BUFFERCACHE_PAGES_ELEM)
+ if (expected_tupledesc->natts >= 9)
TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
INT4OID, -1, 0);
+ if (expected_tupledesc->natts >= 10)
+ TupleDescInitEntry(tupledesc, (AttrNumber) 10, "smgrid",
+ INT2OID, -1, 0);
fctx->tupdesc = BlessTupleDesc(tupledesc);
@@ -153,6 +156,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
+ fctx->record[i].smgrid = bufHdr->tag.smgrid;
fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
@@ -206,6 +210,8 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
nulls[7] = true;
/* unused for v1.0 callers, but the array is always long enough */
nulls[8] = true;
+ /* unused for < v1.4 callers, but the array is always long enough */
+ nulls[9] = true;
}
else
{
@@ -226,6 +232,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
/* unused for v1.0 callers, but the array is always long enough */
values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
nulls[8] = false;
+ /* unused for < v1.4 callers, but the array is always long enough */
+ values[9] = Int16GetDatum(fctx->record[i].smgrid);
+ nulls[9] = false;
}
/* Build and return the tuple. */
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index faf5a3115dc..a0a7be32b4b 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -57,6 +57,13 @@
<entry>ID, in the range 1..<varname>shared_buffers</varname></entry>
</row>
+ <row>
+ <entry><structfield>smgrid</structfield></entry>
+ <entry><type>smallint</type></entry>
+ <entry></entry>
+ <entry>Block storage manager ID. 0 for regular relation data.</entry>
+ </row>
+
<row>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index db1f47ca218..a13b3cd2575 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -217,7 +217,7 @@ brin_xlog_revmap_extend(XLogReaderState *record)
xlrec = (xl_brin_revmap_extend *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &targetBlk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &targetBlk);
Assert(xlrec->targetBlk == targetBlk);
/* Update the metapage */
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b282721..261881c4184 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,11 +95,12 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
+ SmgrId smgrid;
RelFileNode node;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &smgrid, &node, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
node.spcNode, node.dbNode, node.relNode);
}
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index 503db34d863..bf945b9fb50 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -193,7 +193,7 @@ gistRedoDeleteRecord(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
@@ -278,7 +278,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
BlockNumber blkno;
IndexTuple *tuples;
- XLogRecGetBlockTag(record, i + 1, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, i + 1, NULL, NULL, NULL, &blkno);
if (blkno == GIST_ROOT_BLKNO)
{
Assert(i == 0);
@@ -313,7 +313,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
{
BlockNumber nextblkno;
- XLogRecGetBlockTag(record, i + 2, NULL, NULL, &nextblkno);
+ XLogRecGetBlockTag(record, i + 2, NULL, NULL, NULL, &nextblkno);
GistPageGetOpaque(page)->rightlink = nextblkno;
}
else
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d7b70981101..ec604a7d428 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -51,7 +51,7 @@ hash_xlog_init_meta_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
@@ -89,7 +89,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(bitmapbuf);
UnlockReleaseBuffer(bitmapbuf);
@@ -113,7 +113,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
PageSetLSN(page, lsn);
MarkBufferDirty(metabuf);
- XLogRecGetBlockTag(record, 1, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
}
@@ -190,8 +190,8 @@ hash_xlog_add_ovfl_page(XLogReaderState *record)
Size datalen PG_USED_FOR_ASSERTS_ONLY;
bool new_bmpage = false;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &rightblk);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &leftblk);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &rightblk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &leftblk);
ovflbuf = XLogInitBufferForRedo(record, 0);
Assert(BufferIsValid(ovflbuf));
@@ -1001,7 +1001,7 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 376ee2a63b5..f6042e42f91 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -427,7 +427,8 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1021,7 +1022,8 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index d768b9b061c..96e732cd12f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7725,7 +7725,7 @@ heap_xlog_clean(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
@@ -7820,7 +7820,7 @@ heap_xlog_visible(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, &rnode, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -7968,7 +7968,7 @@ heap_xlog_freeze_page(XLogReaderState *record)
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
}
@@ -8040,7 +8040,7 @@ heap_xlog_delete(XLogReaderState *record)
RelFileNode target_node;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8121,7 +8121,7 @@ heap_xlog_insert(XLogReaderState *record)
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8243,7 +8243,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* The visibility map may need to be fixed even if the heap page is
@@ -8389,8 +8389,8 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
- if (XLogRecGetBlockTag(record, 1, NULL, NULL, &oldblk))
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &newblk);
+ if (XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &oldblk))
{
/* HOT updates are never done across pages */
Assert(!hot_update);
@@ -8685,7 +8685,7 @@ heap_xlog_lock(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
@@ -8758,7 +8758,7 @@ heap_xlog_lock_updated(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe98a7..39afef1d3cf 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -634,7 +634,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, *newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 72a448ad316..f47df39bd28 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -331,7 +331,8 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -696,7 +697,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
page,
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad7..f2ce02f07c6 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -172,7 +172,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index d0b9013caf4..931d6cd32b2 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -658,7 +658,8 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the heap NEWPAGE record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(SMGR_MD, &wstate->index->rd_node, MAIN_FORKNUM, blkno,
+ page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 3147ea47268..1c074f77ac3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -216,9 +216,9 @@ btree_xlog_split(bool onleft, XLogReaderState *record)
BlockNumber rightsib;
BlockNumber rnext;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &leftsib);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &rightsib);
- if (!XLogRecGetBlockTag(record, 2, NULL, NULL, &rnext))
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &leftsib);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &rightsib);
+ if (!XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &rnext))
rnext = P_NONE;
/*
@@ -524,7 +524,7 @@ btree_xlog_delete(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd440cf0..8019f6839de 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index ebe6ae8715b..3ce35feee69 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -151,7 +151,7 @@ spgRedoAddLeaf(XLogReaderState *record)
SpGistInnerTuple tuple;
BlockNumber blknoLeaf;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blknoLeaf);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blknoLeaf);
page = BufferGetPage(buffer);
@@ -184,7 +184,7 @@ spgRedoMoveLeafs(XLogReaderState *record)
XLogRedoAction action;
BlockNumber blknoDst;
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoDst);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoDst);
fillFakeState(&state, xldata->stateSrc);
@@ -328,8 +328,8 @@ spgRedoAddNode(XLogReaderState *record)
BlockNumber blkno;
BlockNumber blknoNew;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blkno);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoNew);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoNew);
/*
* In normal operation we would have all three pages (source, dest,
@@ -549,7 +549,7 @@ spgRedoPickSplit(XLogReaderState *record)
BlockNumber blknoInner;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 2, NULL, NULL, &blknoInner);
+ XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &blknoInner);
fillFakeState(&state, xldata->stateSrc);
@@ -879,7 +879,7 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
RelFileNode node;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &node, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
node);
}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b6c9353cbd2..edcbbf0bc94 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1372,6 +1372,7 @@ checkXLogConsistency(XLogReaderState *record)
ForkNumber forknum;
BlockNumber blkno;
int block_id;
+ SmgrId smgrid;
/* Records with no backup blocks have no need for consistency checks. */
if (!XLogRecHasAnyBlockRefs(record))
@@ -1384,7 +1385,8 @@ checkXLogConsistency(XLogReaderState *record)
Buffer buf;
Page page;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -1409,7 +1411,7 @@ checkXLogConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
RBM_NORMAL_NO_LOG);
if (!BufferIsValid(buf))
continue;
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d468b5..1697797bc08 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -43,7 +43,8 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ SmgrId smgrid; /* identifies the SMGR, relation and block */
+ RelFileNode rnode;
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->smgrid, &regbuf->rnode, &regbuf->forkno,
+ &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -248,7 +250,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -263,8 +266,9 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
- BlockNumber blknum, Page page, uint8 flags)
+XLogRegisterBlock(uint8 block_id, SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forknum, BlockNumber blknum, Page page,
+ uint8 flags)
{
registered_buffer *regbuf;
@@ -280,6 +284,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
+ regbuf->smgrid = smgrid;
regbuf->rnode = *rnode;
regbuf->forkno = forknum;
regbuf->block = blknum;
@@ -303,7 +308,8 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -702,7 +708,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && regbuf->smgrid == prev_regbuf->smgrid &&
+ RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -727,6 +734,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
+ memcpy(scratch, &regbuf->smgrid, sizeof(SmgrId));
+ scratch += sizeof(SmgrId);
memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
scratch += sizeof(RelFileNode);
}
@@ -919,6 +928,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -947,8 +957,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkno, &blkno);
+ XLogRegisterBlock(0, smgrid, &rnode, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -969,8 +979,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
- Page page, bool page_std)
+log_newpage(SmgrId smgrid, RelFileNode *rnode, ForkNumber forkNum,
+ BlockNumber blkno, Page page, bool page_std)
{
int flags;
XLogRecPtr recptr;
@@ -980,7 +990,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, smgrid, rnode, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1009,6 +1019,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1016,9 +1027,9 @@ log_newpage_buffer(Buffer buffer, bool page_std)
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(smgrid, &rnode, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 41dae916b46..1ce989c20fd 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1046,6 +1046,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
uint32 remaining;
uint32 datatotal;
RelFileNode *rnode = NULL;
+ SmgrId smgrid = -1;
uint8 block_id;
ResetDecoder(state);
@@ -1219,8 +1220,10 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
+ COPY_HEADER_FIELD(&blk->smgrid, sizeof(SmgrId));
COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
rnode = &blk->rnode;
+ smgrid = blk->smgrid;
}
else
{
@@ -1232,6 +1235,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
goto err;
}
+ blk->smgrid = smgrid;
blk->rnode = *rnode;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
@@ -1339,12 +1343,13 @@ err:
/*
* Returns information about the block that a block reference refers to.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
- * *forknum, and *blknum are filled in (if not NULL), and returns true.
+ * If the WAL record contains a block reference with the given ID, *smgrid,
+ * *rnode, *forknum and *blknum are filled in (if not NULL), and returns true.
* Otherwise returns false.
*/
bool
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
+ SmgrId *smgrid,
RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
{
DecodedBkpBlock *bkpb;
@@ -1353,6 +1358,8 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->blocks[block_id];
+ if (smgrid)
+ *smgrid = bkpb->smgrid;
if (rnode)
*rnode = bkpb->rnode;
if (forknum)
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 10a663bae62..c5f27fb0e17 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -335,8 +335,9 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Page page;
bool zeromode;
bool willinit;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum, &blkno))
{
/* Caller specified a bogus block_id */
elog(PANIC, "failed to locate backup block with ID %d", block_id);
@@ -357,7 +358,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK);
page = BufferGetPage(*buf);
if (!RestoreBlockImage(record, block_id, page))
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode);
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno, mode);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -434,7 +435,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode)
{
BlockNumber lastblock;
@@ -444,7 +445,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
Assert(blkno != P_NEW);
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(smgrid, rnode, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -461,7 +462,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
else
@@ -486,7 +487,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum,
P_NEW, mode, NULL);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -496,7 +497,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f7fe2..9509c197786 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -102,7 +102,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(SMGR_MD, rnode, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
@@ -353,7 +353,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(SMGR_MD, &dst->smgr_rnode.node, forkNum, blkno, page,
+ false);
PageSetChecksumInplace(page, blkno);
@@ -428,7 +429,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(SMGR_MD, pending->relnode, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -580,7 +581,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -589,7 +590,7 @@ smgr_redo(XLogReaderState *record)
SMgrRelation reln;
Relation rel;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0f1a9f0e548..2896df0d2ba 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12623,7 +12623,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 151c3ef8825..3e96d2a0752 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -683,7 +683,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -731,7 +731,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -796,7 +796,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -892,7 +892,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &rnode, NULL, NULL);
if (rnode.dbNode != ctx->slot->data.database)
return;
@@ -991,7 +991,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
RelFileNode target_node;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 591377d2cd7..5cd44970110 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3496,6 +3496,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
ReorderBufferTupleCidEnt *ent;
ForkNumber forkno;
BlockNumber blockno;
+ SmgrId smgrid;
bool updated_mapping = false;
/* be careful about padding */
@@ -3507,7 +3508,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &smgrid, &key.relnode, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7332e6b5903..8046334c253 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -555,7 +555,8 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_which,
+ reln->rd_smgr->smgr_rnode.node,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -680,13 +681,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* parameters.
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(smgrid, rnode, InvalidBackendId);
Assert(InRecovery);
@@ -1009,7 +1010,8 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1843,6 +1845,7 @@ BufferSync(int flags)
buf_state |= BM_CHECKPOINT_NEEDED;
item = &CkptBufferIds[num_to_scan++];
+ item->smgrid = bufHdr->tag.smgrid;
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
item->relNode = bufHdr->tag.rnode.relNode;
@@ -2626,12 +2629,12 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
- * a buffer.
+ * Returns the SMGR ID, relfilenode, fork number and block number
+ * associated with a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum)
+BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2644,6 +2647,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
+ *smgrid = bufHdr->tag.smgrid;
*rnode = bufHdr->tag.rnode;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
@@ -2695,7 +2699,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.smgrid, buf->tag.rnode, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -4220,6 +4224,11 @@ ckpt_buforder_comparator(const void *pa, const void *pb)
const CkptSortItem *a = (const CkptSortItem *) pa;
const CkptSortItem *b = (const CkptSortItem *) pb;
+ /* compare smgr */
+ if (a->smgrid < b->smgrid)
+ return -1;
+ else if (a->smgrid > b->smgrid)
+ return 1;
/* compare tablespace */
if (a->tsId < b->tsId)
return -1;
@@ -4377,7 +4386,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.smgrid, tag.rnode, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29222b..79a185e034f 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,8 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +112,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +211,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.smgrid, bufHdr->tag.rnode, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index c17b3f49dd0..78a2274e55c 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -210,7 +210,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR);
+ buf = XLogReadBufferExtended(SMGR_MD, rnode, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f12dd..da3b286ca6c 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,11 +268,12 @@ restart:
*
* Fix the corruption and restart.
*/
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &smgrid, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 58c94e9257a..b2c42cf8f0a 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -120,7 +120,7 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
bool isRedo);
-static MdfdVec *mdopen(SMgrRelation reln, ForkNumber forknum, int behavior);
+static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
@@ -151,6 +151,17 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+/*
+ * mdopen() -- Initialize a newly-opened relation.
+ */
+void
+mdopen(SMgrRelation reln)
+{
+ /* mark it not open */
+ for (int forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+ reln->md_num_open_segs[forknum] = 0;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -165,7 +176,7 @@ mdexists(SMgrRelation reln, ForkNumber forkNum)
*/
mdclose(reln, forkNum);
- return (mdopen(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
+ return (mdopenfork(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
}
/*
@@ -425,7 +436,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
}
/*
- * mdopen() -- Open the specified relation.
+ * mdopenfork() -- Open the specified relation.
*
* Note we only open the first segment, when there are multiple segments.
*
@@ -435,7 +446,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* invent one out of whole cloth.
*/
static MdfdVec *
-mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
+mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
{
MdfdVec *mdfd;
char *path;
@@ -713,11 +724,11 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL);
+ MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
- /* mdopen has opened the first segment */
+ /* mdopenfork has opened the first segment */
Assert(reln->md_num_open_segs[forknum] > 0);
/*
@@ -981,7 +992,7 @@ DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
srels = palloc(sizeof(SMgrRelation) * ndelrels);
for (i = 0; i < ndelrels; i++)
{
- SMgrRelation srel = smgropen(delrels[i], InvalidBackendId);
+ SMgrRelation srel = smgropen(SMGR_MD, delrels[i], InvalidBackendId);
if (isRedo)
{
@@ -1137,7 +1148,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
v = &reln->md_seg_fds[forknum][reln->md_num_open_segs[forknum] - 1];
else
{
- v = mdopen(reln, forknum, behavior);
+ v = mdopenfork(reln, forknum, behavior);
if (!v)
return NULL; /* if behavior & EXTENSION_RETURN_NULL */
}
@@ -1254,7 +1265,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(SMGR_MD, ftag->rnode, InvalidBackendId);
MdfdVec *v;
char *p;
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index dba8c397feb..26281fab51d 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -41,6 +41,7 @@ typedef struct f_smgr
{
void (*smgr_init) (void); /* may be NULL */
void (*smgr_shutdown) (void); /* may be NULL */
+ void (*smgr_open) (SMgrRelation reln);
void (*smgr_close) (SMgrRelation reln, ForkNumber forknum);
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
@@ -68,6 +69,7 @@ static const f_smgr smgrsw[] = {
{
.smgr_init = mdinit,
.smgr_shutdown = NULL,
+ .smgr_open = mdopen,
.smgr_close = mdclose,
.smgr_create = mdcreate,
.smgr_exists = mdexists,
@@ -141,7 +143,7 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(SmgrId smgrid, RelFileNode rnode, BackendId backend)
{
RelFileNodeBackend brnode;
SMgrRelation reln;
@@ -170,18 +172,15 @@ smgropen(RelFileNode rnode, BackendId backend)
/* Initialize it if not present before */
if (!found)
{
- int forknum;
-
/* hash_search already filled in the lookup key */
reln->smgr_owner = NULL;
reln->smgr_targblock = InvalidBlockNumber;
reln->smgr_fsm_nblocks = InvalidBlockNumber;
reln->smgr_vm_nblocks = InvalidBlockNumber;
- reln->smgr_which = 0; /* we only have md.c at present */
+ reln->smgr_which = smgrid;
- /* mark it not open */
- for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- reln->md_num_open_segs[forknum] = 0;
+ /* implementation-specific initialization */
+ smgrsw[reln->smgr_which].smgr_open(reln);
/* it has no owner yet */
dlist_push_tail(&unowned_relns, &reln->node);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 287af60c4e7..44f4c418916 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -394,8 +394,14 @@ extractPageInfo(XLogReaderState *record)
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
+ continue;
+
+ /* TODO: How should we handle other smgr IDs? */
+ if (smgrid != SMGR_MD)
continue;
/* We only care about the main fork; others are copied in toto */
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index b95d467805a..9b9b4502b3b 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -524,6 +524,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
const RmgrDescData *desc = &RmgrDescTable[XLogRecGetRmid(record)];
uint32 rec_len;
uint32 fpi_len;
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blk;
@@ -556,16 +557,19 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%u: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
else
- printf(", blkref #%u: rel %u/%u/%u blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
if (XLogRecHasBlockImage(record, block_id))
@@ -587,9 +591,11 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%u: rel %u/%u/%u fork %s blk %u",
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
+ printf("\tblkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index df24089ea45..3ae9d225bbf 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -16,6 +16,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
/*
@@ -45,15 +46,17 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int nbuffers, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
- ForkNumber forknum, BlockNumber blknum, char *page,
- uint8 flags);
+extern void XLogRegisterBlock(uint8 block_id, SmgrId smgrid,
+ RelFileNode *rnode, ForkNumber forknum,
+ BlockNumber blknum,
+ char *page, uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
- BlockNumber blk, char *page, bool page_std);
+extern XLogRecPtr log_newpage(SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forkNum, BlockNumber blk,
+ char *page, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
BlockNumber startblk, BlockNumber endblk, bool page_std);
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 04228e2a871..3c545ff95a9 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -26,6 +26,7 @@
#define XLOGREADER_H
#include "access/xlogrecord.h"
+#include "storage/smgr.h"
typedef struct XLogReaderState XLogReaderState;
@@ -43,6 +44,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
@@ -243,7 +245,7 @@ extern bool DecodeXLogRecord(XLogReaderState *state, XLogRecord *record,
extern bool RestoreBlockImage(XLogReaderState *recoder, uint8 block_id, char *dst);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern bool XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum);
+ SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum);
#endif /* XLOGREADER_H */
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index 4105b59904b..366b8d32812 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -13,6 +13,7 @@
#include "access/xlogreader.h"
#include "storage/bufmgr.h"
+#include "storage/smgr.h"
extern bool XLogHaveInvalidPages(void);
@@ -41,7 +42,8 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode,
+ ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode);
extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7e7e7..e0d7e8f9fa3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -87,16 +87,20 @@
*
* Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
* to be fixed to zero them, since this struct is used as a hash key.
+ * Conceptually the SmgrId should go first, but we put it next to the
+ * ForkNumber so that it packs better with typical alignment rules.
*/
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
+ int16 smgrid; /* SmgrId */
+ int16 forkNum; /* ForkNumber */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
+ (a).smgrid = SMGR_INVALID, \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
(a).rnode.relNode = InvalidOid, \
@@ -104,8 +108,9 @@ typedef struct buftag
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_smgrid,xx_rnode,xx_forkNum,xx_blockNum) \
( \
+ (a).smgrid = (xx_smgrid), \
(a).rnode = (xx_rnode), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
@@ -113,6 +118,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
+ (a).smgrid == (b).smgrid && \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
@@ -288,6 +294,7 @@ extern BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
+ SmgrId smgrid;
Oid tsId;
Oid relNode;
ForkNumber forkNum;
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7ef1c..013a7de338e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -18,6 +18,7 @@
#include "storage/buf.h"
#include "storage/bufpage.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -168,9 +169,9 @@ extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
- ForkNumber forkNum, BlockNumber blockNum,
- ReadBufferMode mode, BufferAccessStrategy strategy);
+extern Buffer ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode,
+ ForkNumber forkNum, BlockNumber blockNum,
+ ReadBufferMode mode, BufferAccessStrategy strategy);
extern void ReleaseBuffer(Buffer buffer);
extern void UnlockReleaseBuffer(Buffer buffer);
extern void MarkBufferDirty(Buffer buffer);
@@ -205,7 +206,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index df24b931613..c0f05e23ff9 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -21,6 +21,7 @@
/* md storage manager functionality */
extern void mdinit(void);
+extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index d286c8c7b11..243efc6cdb1 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -79,8 +79,14 @@ typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+typedef enum SmgrId
+{
+ SMGR_INVALID = -1,
+ SMGR_MD = 0, /* md.c */
+} SmgrId;
+
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(SmgrId which, RelFileNode rnode, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d35b4a5061c..2b19683eb52 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -52,6 +52,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
+ SmgrId rd_smgrid; /* relation storage manager */
RelFileNode rd_node; /* relation physical identifier */
/* use "struct" here to avoid needing to include smgr.h: */
struct SMgrRelationData *rd_smgr; /* cached file handle, or NULL */
@@ -471,7 +472,10 @@ typedef struct ViewOptions
#define RelationOpenSmgr(relation) \
do { \
if ((relation)->rd_smgr == NULL) \
- smgrsetowner(&((relation)->rd_smgr), smgropen((relation)->rd_node, (relation)->rd_backend)); \
+ smgrsetowner(&((relation)->rd_smgr), \
+ smgropen((relation)->rd_smgrid, \
+ (relation)->rd_node, \
+ (relation)->rd_backend)); \
} while (0)
/*
--
2.21.0
On Fri, Jul 12, 2019 at 10:16:21AM +1200, Thomas Munro wrote:
Attempts to keep subtopics separated have so
far failed, so the thread ostensibly about orphaned file cleanup is
now about undo work allocation, but I figured it'd be useful to
highlight this patch separately as it'll be the first to go in, and
it's needed by your work Shawn. So I hope we're still on the same
page with this refactoring patch.
Thanks for reminding me about this thread - I will revisit this again,
had some more feedback after doing my PoC for the pgCon. Need to find
that too...
One thing I'm not sure about is the TODO message in parsexlog.c's
extractPageInfo() function.[1] /messages/by-id/CA+Tgmob4htT-9Tq7eHG3wS=dpKFbQZOyqgSr1iWmV_65Duz6Pw@mail.gmail.com
+
+ /* TODO: How should we handle other smgr IDs? */
+ if (smgrid != SMGR_MD)
continue;
All files are copied verbatim from source to target except for relation
files. So this would include slru data and undo data. From what I read
in the docs, I do not believe we need any special handling for either
new SMGRs and your current code should suffice.
process_block_change() is very relation specific so if different
handling is required by different SMGRs, it would make sense to call on
smgr specific functions instead.
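
One way to picture the suggestion above is a small per-SMGR dispatch table. This is only a sketch under stated assumptions: block_change_fn, md_block_change, block_change_handlers and dispatch_block_change are names invented here for illustration and do not exist in pg_rewind or in the patch. Only SmgrId, SMGR_INVALID and SMGR_MD come from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the patch's smgr.h hunk. */
typedef enum SmgrId
{
	SMGR_INVALID = -1,
	SMGR_MD = 0					/* md.c */
} SmgrId;

/* Hypothetical: per-SMGR handler for a block change found in the WAL. */
typedef void (*block_change_fn) (uint32_t blkno, int *pages_marked);

/* Hypothetical stand-in for the relation-specific handling. */
static void
md_block_change(uint32_t blkno, int *pages_marked)
{
	(void) blkno;
	(*pages_marked)++;
}

/* Hypothetical table indexed by SmgrId; a NULL entry means "skip". */
static const block_change_fn block_change_handlers[] = {
	md_block_change				/* SMGR_MD */
};

#define NUM_HANDLERS \
	((int) (sizeof(block_change_handlers) / sizeof(block_change_handlers[0])))

/* Returns 1 if a handler ran, 0 if the block reference was skipped. */
static int
dispatch_block_change(SmgrId smgrid, uint32_t blkno, int *pages_marked)
{
	if ((int) smgrid < 0 || (int) smgrid >= NUM_HANDLERS)
		return 0;
	if (block_change_handlers[smgrid] == NULL)
		return 0;
	block_change_handlers[smgrid] (blkno, pages_marked);
	return 1;
}
```

With a table like this, the `if (smgrid != SMGR_MD) continue;` test becomes the "no handler registered" case, and a future SMGR that does want special treatment would only need to add an entry.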
Can't wait for the SMGR_MD to SMGR_REL change :-) It will make
understanding this code a tad bit easier.
--
Shawn Debnath
Amazon Web Services (AWS)
On Fri, Jul 12, 2019 at 11:19 AM Shawn Debnath <sdn@amazon.com> wrote:
On Fri, Jul 12, 2019 at 10:16:21AM +1200, Thomas Munro wrote:
+
+ /* TODO: How should we handle other smgr IDs? */
+ if (smgrid != SMGR_MD)
continue;
All files are copied verbatim from source to target except for relation
files. So this would include slru data and undo data. From what I read
in the docs, I do not believe we need any special handling for either
new SMGRs and your current code should suffice.
process_block_change() is very relation specific so if different
handling is required by different SMGRs, it would make sense to call on
smgr specific functions instead.
Right. And since undo and slru etc data will be WAL-logged with block
references, it's entirely possible to teach it to scan them properly,
though it's not clear whether it's worth doing that. Ok, good, TODO
removed.
Can't wait for the SMGR_MD to SMGR_REL change :-) It will make
understanding this code a tad bit easier.
Or could we retrofit different words that start with M and D?
Here's a new version of the patch set (ie the first 3 patches in the
undo patch set, and the part that I think you need for slru work),
this time with the pg_buffercache changes as a separate commit since
it's somewhat independent and has a different (partial) reviewer.
I was starting to think about whether I might be able to commit these,
but now I see that this increase in WAL size is probably not
acceptable:
@@ -727,6 +734,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
+ memcpy(scratch, &regbuf->smgrid, sizeof(SmgrId));
+ scratch += sizeof(SmgrId);
memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
scratch += sizeof(RelFileNode);
}
@@ -1220,8 +1221,10 @@ DecodeXLogRecord(XLogReaderState *state,
XLogRecord *record, char **errormsg)
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
+ COPY_HEADER_FIELD(&blk->smgrid, sizeof(SmgrId));
COPY_HEADER_FIELD(&blk->rnode,
sizeof(RelFileNode));
rnode = &blk->rnode;
+ smgrid = blk->smgrid;
}
That's an enum, so it works out to a word per record. The obvious way
to avoid increasing the size is to shove the SMGR ID into the same space
that holds the forknum. Unlike BufferTag, where forknum currently
swims in 32 bits which this patch chops in half, the fork number in
XLogRecordBlockHeader is already crammed into a uint8 fork_flags, of
which it has only the lower nibble; the upper nibble is used for the
BKPBLOCK_xxx flag bits, and there isn't even a spare bit to say
'has non-zero SMGR ID'. Rats. I suppose I could change it to a byte.
I wonder if one extra byte per WAL record is acceptable. Anyone?
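
To make the space problem concrete, here is a minimal sketch of how fork_flags is divided up. The BKPBLOCK_* values are reproduced from memory of access/xlogrecord.h and should be checked against the tree. WalSmgrId and the helper functions are hypothetical names used only for illustration; they show the "change it to a byte" idea and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Believed to mirror the fork_flags layout in access/xlogrecord.h. */
#define BKPBLOCK_FORK_MASK	0x0F
#define BKPBLOCK_FLAG_MASK	0xF0
#define BKPBLOCK_HAS_IMAGE	0x10
#define BKPBLOCK_HAS_DATA	0x20
#define BKPBLOCK_WILL_INIT	0x40
#define BKPBLOCK_SAME_REL	0x80

/* Hypothetical: a one-byte on-disk SMGR ID instead of a full enum. */
typedef uint8_t WalSmgrId;

/* Pack a fork number (low nibble) and flag bits (high nibble). */
static uint8_t
pack_fork_flags(uint8_t forknum, uint8_t flags)
{
	assert((forknum & ~BKPBLOCK_FORK_MASK) == 0);
	assert((flags & ~BKPBLOCK_FLAG_MASK) == 0);
	return (uint8_t) (forknum | flags);
}

/* Extract the fork number again. */
static uint8_t
fork_from_fork_flags(uint8_t fork_flags)
{
	return (uint8_t) (fork_flags & BKPBLOCK_FORK_MASK);
}

/* Count flag bits in the upper nibble that are not yet assigned. */
static int
spare_flag_bits(void)
{
	uint8_t		used = BKPBLOCK_HAS_IMAGE | BKPBLOCK_HAS_DATA |
		BKPBLOCK_WILL_INIT | BKPBLOCK_SAME_REL;
	uint8_t		spare = (uint8_t) (BKPBLOCK_FLAG_MASK & ~used);
	int			n = 0;

	while (spare)
	{
		n += spare & 1;
		spare >>= 1;
	}
	return n;
}
```

Under these definitions all four upper bits are taken, so there is nowhere to put a "has non-zero SMGR ID" flag, and writing a separate one-byte ID next to the RelFileNode is the smallest change that does not touch fork_flags.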
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Add-SmgrId-to-smgropen-and-BufferTag-v3.patch
From 5d991456d26e9b4f35b3653c0049625eac006900 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 15 Jul 2019 21:25:08 +1200
Subject: [PATCH 1/3] Add SmgrId to smgropen() and BufferTag.
To use bufmgr.c for new kinds of data in addition to plain old
relations, add an SMGR argument to places that identify blocks
and the files that hold them (smgropen(), block references in
the WAL, BufferTag).
To avoid making BufferTag wider, take some space away from the
fork number for this new member, since there are just a few
values possible (a suggestion from Anton Shyrabokau).
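
As a rough check of the "no wider" claim, the sketch below compares the old and new tag layouts. The typedefs are simplified local copies written for this example (OldBufferTag and NewBufferTag are illustrative names, and the ForkNumber values are reproduced from memory). It assumes a typical platform where an enum occupies 4 bytes, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified local stand-ins for the PostgreSQL types. */
typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct RelFileNode
{
	Oid			spcNode;
	Oid			dbNode;
	Oid			relNode;
} RelFileNode;

typedef enum ForkNumber
{
	InvalidForkNumber = -1,
	MAIN_FORKNUM = 0,
	FSM_FORKNUM,
	VISIBILITYMAP_FORKNUM,
	INIT_FORKNUM
} ForkNumber;

/* Layout before the patch: the fork number takes a whole enum. */
typedef struct OldBufferTag
{
	RelFileNode rnode;
	ForkNumber	forkNum;
	BlockNumber blockNum;
} OldBufferTag;

/* Layout after the patch: two int16 fields share that space. */
typedef struct NewBufferTag
{
	RelFileNode rnode;
	int16_t		smgrid;
	int16_t		forkNum;
	BlockNumber blockNum;
} NewBufferTag;

/* Assumes 4-byte enums; fails to compile where that does not hold. */
_Static_assert(sizeof(NewBufferTag) == sizeof(OldBufferTag),
			   "tag size changed");

/* Returns 1 if the new tag has no pad bytes (it is used as a hash key). */
static int
new_tag_has_no_padding(void)
{
	return sizeof(NewBufferTag) ==
		sizeof(RelFileNode) + 2 * sizeof(int16_t) + sizeof(BlockNumber);
}
```

The padding check matters because the tag is hashed as raw bytes, which is why the patch places smgrid next to forkNum rather than first.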
Create a new callback for smgropen() calls so that some md.c-
specific stuff can move out of smgropen(), and future
implementations can also run their own initialization code.
Author: Thomas Munro
Reviewed-by: Shawn Debnath
Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
---
contrib/bloom/blinsert.c | 2 +-
src/backend/access/brin/brin_xlog.c | 2 +-
src/backend/access/gin/ginxlog.c | 3 +-
src/backend/access/gist/gistxlog.c | 6 +--
src/backend/access/hash/hash_xlog.c | 12 +++---
src/backend/access/hash/hashpage.c | 6 ++-
src/backend/access/heap/heapam.c | 20 +++++-----
src/backend/access/heap/heapam_handler.c | 2 +-
src/backend/access/heap/rewriteheap.c | 6 ++-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 3 +-
src/backend/access/nbtree/nbtxlog.c | 8 ++--
src/backend/access/spgist/spginsert.c | 6 +--
src/backend/access/spgist/spgxlog.c | 12 +++---
src/backend/access/transam/xlog.c | 6 ++-
src/backend/access/transam/xloginsert.c | 39 ++++++++++++-------
src/backend/access/transam/xlogreader.c | 11 +++++-
src/backend/access/transam/xlogutils.c | 17 ++++----
src/backend/catalog/storage.c | 11 +++---
src/backend/commands/tablecmds.c | 2 +-
src/backend/replication/logical/decode.c | 10 ++---
.../replication/logical/reorderbuffer.c | 3 +-
src/backend/storage/buffer/bufmgr.c | 29 +++++++++-----
src/backend/storage/buffer/localbuf.c | 8 ++--
src/backend/storage/freespace/freespace.c | 3 +-
src/backend/storage/freespace/fsmpage.c | 3 +-
src/backend/storage/smgr/md.c | 29 +++++++++-----
src/backend/storage/smgr/smgr.c | 13 +++----
src/bin/pg_rewind/parsexlog.c | 8 +++-
src/bin/pg_waldump/pg_waldump.c | 16 +++++---
src/include/access/xloginsert.h | 13 ++++---
src/include/access/xlogreader.h | 6 ++-
src/include/access/xlogutils.h | 4 +-
src/include/storage/buf_internals.h | 11 +++++-
src/include/storage/bufmgr.h | 9 +++--
src/include/storage/md.h | 1 +
src/include/storage/smgr.h | 8 +++-
src/include/utils/rel.h | 6 ++-
38 files changed, 223 insertions(+), 133 deletions(-)
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 4b2186b8dda..e39d21df1f6 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -181,7 +181,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index db1f47ca218..a13b3cd2575 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -217,7 +217,7 @@ brin_xlog_revmap_extend(XLogReaderState *record)
xlrec = (xl_brin_revmap_extend *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &targetBlk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &targetBlk);
Assert(xlrec->targetBlk == targetBlk);
/* Update the metapage */
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b282721..261881c4184 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,11 +95,12 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
+ SmgrId smgrid;
RelFileNode node;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &smgrid, &node, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
node.spcNode, node.dbNode, node.relNode);
}
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index 503db34d863..bf945b9fb50 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -193,7 +193,7 @@ gistRedoDeleteRecord(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
@@ -278,7 +278,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
BlockNumber blkno;
IndexTuple *tuples;
- XLogRecGetBlockTag(record, i + 1, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, i + 1, NULL, NULL, NULL, &blkno);
if (blkno == GIST_ROOT_BLKNO)
{
Assert(i == 0);
@@ -313,7 +313,7 @@ gistRedoPageSplitRecord(XLogReaderState *record)
{
BlockNumber nextblkno;
- XLogRecGetBlockTag(record, i + 2, NULL, NULL, &nextblkno);
+ XLogRecGetBlockTag(record, i + 2, NULL, NULL, NULL, &nextblkno);
GistPageGetOpaque(page)->rightlink = nextblkno;
}
else
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d7b70981101..ec604a7d428 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -51,7 +51,7 @@ hash_xlog_init_meta_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
@@ -89,7 +89,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
* special handling for init forks as create index operations don't log a
* full page image of the metapage.
*/
- XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(bitmapbuf);
UnlockReleaseBuffer(bitmapbuf);
@@ -113,7 +113,7 @@ hash_xlog_init_bitmap_page(XLogReaderState *record)
PageSetLSN(page, lsn);
MarkBufferDirty(metabuf);
- XLogRecGetBlockTag(record, 1, NULL, &forknum, NULL);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, &forknum, NULL);
if (forknum == INIT_FORKNUM)
FlushOneBuffer(metabuf);
}
@@ -190,8 +190,8 @@ hash_xlog_add_ovfl_page(XLogReaderState *record)
Size datalen PG_USED_FOR_ASSERTS_ONLY;
bool new_bmpage = false;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &rightblk);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &leftblk);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &rightblk);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &leftblk);
ovflbuf = XLogInitBufferForRedo(record, 0);
Assert(BufferIsValid(ovflbuf));
@@ -1001,7 +1001,7 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 376ee2a63b5..f6042e42f91 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -427,7 +427,8 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1021,7 +1022,8 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(SMGR_MD,
+ &rel->rd_node,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index d768b9b061c..96e732cd12f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7725,7 +7725,7 @@ heap_xlog_clean(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
@@ -7820,7 +7820,7 @@ heap_xlog_visible(XLogReaderState *record)
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, &rnode, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -7968,7 +7968,7 @@ heap_xlog_freeze_page(XLogReaderState *record)
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
}
@@ -8040,7 +8040,7 @@ heap_xlog_delete(XLogReaderState *record)
RelFileNode target_node;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8121,7 +8121,7 @@ heap_xlog_insert(XLogReaderState *record)
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &target_node, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -8243,7 +8243,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &blkno);
/*
* The visibility map may need to be fixed even if the heap page is
@@ -8389,8 +8389,8 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
- if (XLogRecGetBlockTag(record, 1, NULL, NULL, &oldblk))
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &newblk);
+ if (XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &oldblk))
{
/* HOT updates are never done across pages */
Assert(!hot_update);
@@ -8685,7 +8685,7 @@ heap_xlog_lock(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
@@ -8758,7 +8758,7 @@ heap_xlog_lock_updated(XLogReaderState *record)
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, &block);
reln = CreateFakeRelcacheEntry(rnode);
visibilitymap_pin(reln, block, &vmbuffer);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe98a7..39afef1d3cf 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -634,7 +634,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, *newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 72a448ad316..f47df39bd28 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -331,7 +331,8 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -696,7 +697,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (state->rs_use_wal)
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(SMGR_MD,
+ &state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
page,
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd5289ad7..f2ce02f07c6 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -172,7 +172,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index d0b9013caf4..931d6cd32b2 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -658,7 +658,8 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the heap NEWPAGE record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(SMGR_MD, &wstate->index->rd_node, MAIN_FORKNUM, blkno,
+ page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 3147ea47268..1c074f77ac3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -216,9 +216,9 @@ btree_xlog_split(bool onleft, XLogReaderState *record)
BlockNumber rightsib;
BlockNumber rnext;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &leftsib);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &rightsib);
- if (!XLogRecGetBlockTag(record, 2, NULL, NULL, &rnext))
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &leftsib);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &rightsib);
+ if (!XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &rnext))
rnext = P_NONE;
/*
@@ -524,7 +524,7 @@ btree_xlog_delete(XLogReaderState *record)
{
RelFileNode rnode;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &rnode, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd440cf0..8019f6839de 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(SMGR_MD, &index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index ebe6ae8715b..3ce35feee69 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -151,7 +151,7 @@ spgRedoAddLeaf(XLogReaderState *record)
SpGistInnerTuple tuple;
BlockNumber blknoLeaf;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blknoLeaf);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blknoLeaf);
page = BufferGetPage(buffer);
@@ -184,7 +184,7 @@ spgRedoMoveLeafs(XLogReaderState *record)
XLogRedoAction action;
BlockNumber blknoDst;
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoDst);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoDst);
fillFakeState(&state, xldata->stateSrc);
@@ -328,8 +328,8 @@ spgRedoAddNode(XLogReaderState *record)
BlockNumber blkno;
BlockNumber blknoNew;
- XLogRecGetBlockTag(record, 0, NULL, NULL, &blkno);
- XLogRecGetBlockTag(record, 1, NULL, NULL, &blknoNew);
+ XLogRecGetBlockTag(record, 0, NULL, NULL, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, NULL, NULL, NULL, &blknoNew);
/*
* In normal operation we would have all three pages (source, dest,
@@ -549,7 +549,7 @@ spgRedoPickSplit(XLogReaderState *record)
BlockNumber blknoInner;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 2, NULL, NULL, &blknoInner);
+ XLogRecGetBlockTag(record, 2, NULL, NULL, NULL, &blknoInner);
fillFakeState(&state, xldata->stateSrc);
@@ -879,7 +879,7 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
RelFileNode node;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, NULL, &node, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
node);
}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b6c9353cbd2..edcbbf0bc94 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1372,6 +1372,7 @@ checkXLogConsistency(XLogReaderState *record)
ForkNumber forknum;
BlockNumber blkno;
int block_id;
+ SmgrId smgrid;
/* Records with no backup blocks have no need for consistency checks. */
if (!XLogRecHasAnyBlockRefs(record))
@@ -1384,7 +1385,8 @@ checkXLogConsistency(XLogReaderState *record)
Buffer buf;
Page page;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -1409,7 +1411,7 @@ checkXLogConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
RBM_NORMAL_NO_LOG);
if (!BufferIsValid(buf))
continue;
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d468b5..1697797bc08 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -43,7 +43,8 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ SmgrId smgrid; /* identifies the SMGR, relation and block */
+ RelFileNode rnode;
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->smgrid, &regbuf->rnode, &regbuf->forkno,
+ &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -248,7 +250,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -263,8 +266,9 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
- BlockNumber blknum, Page page, uint8 flags)
+XLogRegisterBlock(uint8 block_id, SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forknum, BlockNumber blknum, Page page,
+ uint8 flags)
{
registered_buffer *regbuf;
@@ -280,6 +284,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
+ regbuf->smgrid = smgrid;
regbuf->rnode = *rnode;
regbuf->forkno = forknum;
regbuf->block = blknum;
@@ -303,7 +308,8 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(regbuf_old->smgrid != regbuf->smgrid ||
+ !RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -702,7 +708,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && regbuf->smgrid == prev_regbuf->smgrid &&
+ RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -727,6 +734,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
+ memcpy(scratch, &regbuf->smgrid, sizeof(SmgrId));
+ scratch += sizeof(SmgrId);
memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
scratch += sizeof(RelFileNode);
}
@@ -919,6 +928,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -947,8 +957,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkno, &blkno);
+ XLogRegisterBlock(0, smgrid, &rnode, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -969,8 +979,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
- Page page, bool page_std)
+log_newpage(SmgrId smgrid, RelFileNode *rnode, ForkNumber forkNum,
+ BlockNumber blkno, Page page, bool page_std)
{
int flags;
XLogRecPtr recptr;
@@ -980,7 +990,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, smgrid, rnode, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1009,6 +1019,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1016,9 +1027,9 @@ log_newpage_buffer(Buffer buffer, bool page_std)
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &smgrid, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(smgrid, &rnode, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 33ccfc15531..aeeb7790b7d 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1047,6 +1047,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
uint32 remaining;
uint32 datatotal;
RelFileNode *rnode = NULL;
+ SmgrId smgrid = -1;
uint8 block_id;
ResetDecoder(state);
@@ -1220,8 +1221,10 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
+ COPY_HEADER_FIELD(&blk->smgrid, sizeof(SmgrId));
COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
rnode = &blk->rnode;
+ smgrid = blk->smgrid;
}
else
{
@@ -1233,6 +1236,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
goto err;
}
+ blk->smgrid = smgrid;
blk->rnode = *rnode;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
@@ -1340,12 +1344,13 @@ err:
/*
* Returns information about the block that a block reference refers to.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
- * *forknum, and *blknum are filled in (if not NULL), and returns true.
+ * If the WAL record contains a block reference with the given ID, *smgrid,
+ * *rnode, *forknum and *blknum are filled in (if not NULL), and returns true.
* Otherwise returns false.
*/
bool
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
+ SmgrId *smgrid,
RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
{
DecodedBkpBlock *bkpb;
@@ -1354,6 +1359,8 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->blocks[block_id];
+ if (smgrid)
+ *smgrid = bkpb->smgrid;
if (rnode)
*rnode = bkpb->rnode;
if (forknum)
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 10a663bae62..c5f27fb0e17 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -335,8 +335,9 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Page page;
bool zeromode;
bool willinit;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum, &blkno))
{
/* Caller specified a bogus block_id */
elog(PANIC, "failed to locate backup block with ID %d", block_id);
@@ -357,7 +358,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK);
page = BufferGetPage(*buf);
if (!RestoreBlockImage(record, block_id, page))
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode);
+ *buf = XLogReadBufferExtended(smgrid, rnode, forknum, blkno, mode);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -434,7 +435,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode)
{
BlockNumber lastblock;
@@ -444,7 +445,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
Assert(blkno != P_NEW);
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(smgrid, rnode, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -461,7 +462,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
else
@@ -486,7 +487,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum,
P_NEW, mode, NULL);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -496,7 +497,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(smgrid, rnode, forknum, blkno,
mode, NULL);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f7fe2..9509c197786 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -102,7 +102,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(SMGR_MD, rnode, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
@@ -353,7 +353,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(SMGR_MD, &dst->smgr_rnode.node, forkNum, blkno, page,
+ false);
PageSetChecksumInplace(page, blkno);
@@ -428,7 +429,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(SMGR_MD, pending->relnode, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -580,7 +581,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -589,7 +590,7 @@ smgr_redo(XLogReaderState *record)
SMgrRelation reln;
Relation rel;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(SMGR_MD, xlrec->rnode, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0f1a9f0e548..2896df0d2ba 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12623,7 +12623,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(SMGR_MD, newrnode, rel->rd_backend);
RelationOpenSmgr(rel);
/*
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 151c3ef8825..3e96d2a0752 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -683,7 +683,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -731,7 +731,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -796,7 +796,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
@@ -892,7 +892,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &rnode, NULL, NULL);
if (rnode.dbNode != ctx->slot->data.database)
return;
@@ -991,7 +991,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
RelFileNode target_node;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
+ XLogRecGetBlockTag(r, 0, NULL, &target_node, NULL, NULL);
if (target_node.dbNode != ctx->slot->data.database)
return;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 591377d2cd7..5cd44970110 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3496,6 +3496,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
ReorderBufferTupleCidEnt *ent;
ForkNumber forkno;
BlockNumber blockno;
+ SmgrId smgrid;
bool updated_mapping = false;
/* be careful about padding */
@@ -3507,7 +3508,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &smgrid, &key.relnode, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7332e6b5903..8046334c253 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -555,7 +555,8 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_which,
+ reln->rd_smgr->smgr_rnode.node,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -680,13 +681,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* parameters.
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(smgrid, rnode, InvalidBackendId);
Assert(InRecovery);
@@ -1009,7 +1010,8 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1843,6 +1845,7 @@ BufferSync(int flags)
buf_state |= BM_CHECKPOINT_NEEDED;
item = &CkptBufferIds[num_to_scan++];
+ item->smgrid = bufHdr->tag.smgrid;
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
item->relNode = bufHdr->tag.rnode.relNode;
@@ -2626,12 +2629,12 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
- * a buffer.
+ * Returns the SMGR ID, relfilenode, fork number and block number
+ * associated with a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum)
+BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2644,6 +2647,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
+ *smgrid = bufHdr->tag.smgrid;
*rnode = bufHdr->tag.rnode;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
@@ -2695,7 +2699,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.smgrid, buf->tag.rnode, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -4220,6 +4224,11 @@ ckpt_buforder_comparator(const void *pa, const void *pb)
const CkptSortItem *a = (const CkptSortItem *) pa;
const CkptSortItem *b = (const CkptSortItem *) pb;
+ /* compare smgr */
+ if (a->smgrid < b->smgrid)
+ return -1;
+ else if (a->smgrid > b->smgrid)
+ return 1;
/* compare tablespace */
if (a->tsId < b->tsId)
return -1;
@@ -4377,7 +4386,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.smgrid, tag.rnode, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29222b..79a185e034f 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,8 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +112,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_which,
+ smgr->smgr_rnode.node, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +211,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.smgrid, bufHdr->tag.rnode, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index c17b3f49dd0..78a2274e55c 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -210,7 +210,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR);
+ buf = XLogReadBufferExtended(SMGR_MD, rnode, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f12dd..da3b286ca6c 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,11 +268,12 @@ restart:
*
* Fix the corruption and restart.
*/
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &smgrid, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 58c94e9257a..b2c42cf8f0a 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -120,7 +120,7 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
bool isRedo);
-static MdfdVec *mdopen(SMgrRelation reln, ForkNumber forknum, int behavior);
+static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
@@ -151,6 +151,17 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+/*
+ * mdopen() -- Initialize a newly-opened relation.
+ */
+void
+mdopen(SMgrRelation reln)
+{
+ /* mark it not open */
+ for (int forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+ reln->md_num_open_segs[forknum] = 0;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -165,7 +176,7 @@ mdexists(SMgrRelation reln, ForkNumber forkNum)
*/
mdclose(reln, forkNum);
- return (mdopen(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
+ return (mdopenfork(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
}
/*
@@ -425,7 +436,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
}
/*
- * mdopen() -- Open the specified relation.
+ * mdopenfork() -- Open the specified relation.
*
* Note we only open the first segment, when there are multiple segments.
*
@@ -435,7 +446,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* invent one out of whole cloth.
*/
static MdfdVec *
-mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
+mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
{
MdfdVec *mdfd;
char *path;
@@ -713,11 +724,11 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL);
+ MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
- /* mdopen has opened the first segment */
+ /* mdopenfork has opened the first segment */
Assert(reln->md_num_open_segs[forknum] > 0);
/*
@@ -981,7 +992,7 @@ DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
srels = palloc(sizeof(SMgrRelation) * ndelrels);
for (i = 0; i < ndelrels; i++)
{
- SMgrRelation srel = smgropen(delrels[i], InvalidBackendId);
+ SMgrRelation srel = smgropen(SMGR_MD, delrels[i], InvalidBackendId);
if (isRedo)
{
@@ -1137,7 +1148,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
v = &reln->md_seg_fds[forknum][reln->md_num_open_segs[forknum] - 1];
else
{
- v = mdopen(reln, forknum, behavior);
+ v = mdopenfork(reln, forknum, behavior);
if (!v)
return NULL; /* if behavior & EXTENSION_RETURN_NULL */
}
@@ -1254,7 +1265,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(SMGR_MD, ftag->rnode, InvalidBackendId);
MdfdVec *v;
char *p;
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index dba8c397feb..26281fab51d 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -41,6 +41,7 @@ typedef struct f_smgr
{
void (*smgr_init) (void); /* may be NULL */
void (*smgr_shutdown) (void); /* may be NULL */
+ void (*smgr_open) (SMgrRelation reln);
void (*smgr_close) (SMgrRelation reln, ForkNumber forknum);
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
@@ -68,6 +69,7 @@ static const f_smgr smgrsw[] = {
{
.smgr_init = mdinit,
.smgr_shutdown = NULL,
+ .smgr_open = mdopen,
.smgr_close = mdclose,
.smgr_create = mdcreate,
.smgr_exists = mdexists,
@@ -141,7 +143,7 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(SmgrId smgrid, RelFileNode rnode, BackendId backend)
{
RelFileNodeBackend brnode;
SMgrRelation reln;
@@ -170,18 +172,15 @@ smgropen(RelFileNode rnode, BackendId backend)
/* Initialize it if not present before */
if (!found)
{
- int forknum;
-
/* hash_search already filled in the lookup key */
reln->smgr_owner = NULL;
reln->smgr_targblock = InvalidBlockNumber;
reln->smgr_fsm_nblocks = InvalidBlockNumber;
reln->smgr_vm_nblocks = InvalidBlockNumber;
- reln->smgr_which = 0; /* we only have md.c at present */
+ reln->smgr_which = smgrid;
- /* mark it not open */
- for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- reln->md_num_open_segs[forknum] = 0;
+ /* implementation-specific initialization */
+ smgrsw[reln->smgr_which].smgr_open(reln);
/* it has no owner yet */
dlist_push_tail(&unowned_relns, &reln->node);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 287af60c4e7..89c183abc61 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -394,8 +394,14 @@ extractPageInfo(XLogReaderState *record)
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
+ SmgrId smgrid;
- if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
+ if (!XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blkno))
+ continue;
+
+ /* We only care about SMGR_MD for now */
+ if (smgrid != SMGR_MD)
continue;
/* We only care about the main fork; others are copied in toto */
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index b95d467805a..9b9b4502b3b 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -524,6 +524,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
const RmgrDescData *desc = &RmgrDescTable[XLogRecGetRmid(record)];
uint32 rec_len;
uint32 fpi_len;
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blk;
@@ -556,16 +557,19 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%u: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
else
- printf(", blkref #%u: rel %u/%u/%u blk %u",
+ printf(", blkref #%u: smgr %d rel %u/%u/%u blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
if (XLogRecHasBlockImage(record, block_id))
@@ -587,9 +591,11 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (!XLogRecHasBlockRef(record, block_id))
continue;
- XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%u: rel %u/%u/%u fork %s blk %u",
+ XLogRecGetBlockTag(record, block_id, &smgrid, &rnode, &forknum,
+ &blk);
+ printf("\tblkref #%u: smgr %d rel %u/%u/%u fork %s blk %u",
block_id,
+ smgrid,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index df24089ea45..3ae9d225bbf 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -16,6 +16,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
/*
@@ -45,15 +46,17 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int nbuffers, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
- ForkNumber forknum, BlockNumber blknum, char *page,
- uint8 flags);
+extern void XLogRegisterBlock(uint8 block_id, SmgrId smgrid,
+ RelFileNode *rnode, ForkNumber forknum,
+ BlockNumber blknum,
+ char *page, uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
- BlockNumber blk, char *page, bool page_std);
+extern XLogRecPtr log_newpage(SmgrId smgrid, RelFileNode *rnode,
+ ForkNumber forkNum, BlockNumber blk,
+ char *page, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
BlockNumber startblk, BlockNumber endblk, bool page_std);
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index a12c94cba67..6ec50b4151e 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -30,6 +30,7 @@
#endif
#include "access/xlogrecord.h"
+#include "storage/smgr.h"
typedef struct XLogReaderState XLogReaderState;
@@ -47,6 +48,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
+ SmgrId smgrid;
RelFileNode rnode;
ForkNumber forknum;
BlockNumber blkno;
@@ -251,7 +253,7 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *recoder, uint8 block_id, char *dst);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern bool XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
- BlockNumber *blknum);
+ SmgrId *smgrid, RelFileNode *rnode,
+ ForkNumber *forknum, BlockNumber *blknum);
#endif /* XLOGREADER_H */
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index 4105b59904b..366b8d32812 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -13,6 +13,7 @@
#include "access/xlogreader.h"
#include "storage/bufmgr.h"
+#include "storage/smgr.h"
extern bool XLogHaveInvalidPages(void);
@@ -41,7 +42,8 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(SmgrId smgrid, RelFileNode rnode,
+ ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode);
extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7e7e7..e0d7e8f9fa3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -87,16 +87,20 @@
*
* Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
* to be fixed to zero them, since this struct is used as a hash key.
+ * Conceptually the SmgrId should go first, but we put it next to the
+ * ForkNumber so that it packs better with typical alignment rules.
*/
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
+ int16 smgrid; /* SmgrId */
+ int16 forkNum; /* ForkNumber */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
+ (a).smgrid = SMGR_INVALID, \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
(a).rnode.relNode = InvalidOid, \
@@ -104,8 +108,9 @@ typedef struct buftag
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_smgrid,xx_rnode,xx_forkNum,xx_blockNum) \
( \
+ (a).smgrid = (xx_smgrid), \
(a).rnode = (xx_rnode), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
@@ -113,6 +118,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
+ (a).smgrid == (b).smgrid && \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
@@ -288,6 +294,7 @@ extern BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
+ SmgrId smgrid;
Oid tsId;
Oid relNode;
ForkNumber forkNum;
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7ef1c..013a7de338e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -18,6 +18,7 @@
#include "storage/buf.h"
#include "storage/bufpage.h"
#include "storage/relfilenode.h"
+#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -168,9 +169,9 @@ extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
- ForkNumber forkNum, BlockNumber blockNum,
- ReadBufferMode mode, BufferAccessStrategy strategy);
+extern Buffer ReadBufferWithoutRelcache(SmgrId smgrid, RelFileNode rnode,
+ ForkNumber forkNum, BlockNumber blockNum,
+ ReadBufferMode mode, BufferAccessStrategy strategy);
extern void ReleaseBuffer(Buffer buffer);
extern void UnlockReleaseBuffer(Buffer buffer);
extern void MarkBufferDirty(Buffer buffer);
@@ -205,7 +206,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, SmgrId *smgrid, RelFileNode *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index df24b931613..c0f05e23ff9 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -21,6 +21,7 @@
/* md storage manager functionality */
extern void mdinit(void);
+extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index d286c8c7b11..243efc6cdb1 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -79,8 +79,14 @@ typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+typedef enum SmgrId
+{
+ SMGR_INVALID = -1,
+ SMGR_MD = 0, /* md.c */
+} SmgrId;
+
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(SmgrId which, RelFileNode rnode, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d35b4a5061c..2b19683eb52 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -52,6 +52,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
+ SmgrId rd_smgrid; /* relation storage manager */
RelFileNode rd_node; /* relation physical identifier */
/* use "struct" here to avoid needing to include smgr.h: */
struct SMgrRelationData *rd_smgr; /* cached file handle, or NULL */
@@ -471,7 +472,10 @@ typedef struct ViewOptions
#define RelationOpenSmgr(relation) \
do { \
if ((relation)->rd_smgr == NULL) \
- smgrsetowner(&((relation)->rd_smgr), smgropen((relation)->rd_node, (relation)->rd_backend)); \
+ smgrsetowner(&((relation)->rd_smgr), \
+ smgropen((relation)->rd_smgrid, \
+ (relation)->rd_node, \
+ (relation)->rd_backend)); \
} while (0)
/*
--
2.21.0
0002-Add-smgrid-column-to-pg_buffercache-v3.patch (application/octet-stream)
From e93afd4adb36676298ef0ceb48721e4f1d1ce8b0 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 15 Jul 2019 21:26:41 +1200
Subject: [PATCH 2/3] Add smgrid column to pg_buffercache.
Buffer tags now include an SMGR selector. Show it as a new column in the
pg_buffercache extension.
Author: Thomas Munro
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.2.sql | 21 ----------
.../pg_buffercache--1.3--1.4.sql | 36 ++++++++++++++++
.../pg_buffercache/pg_buffercache--1.4.sql | 41 +++++++++++++++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 15 +++++--
doc/src/sgml/pgbuffercache.sgml | 7 ++++
7 files changed, 100 insertions(+), 26 deletions(-)
delete mode 100644 contrib/pg_buffercache/pg_buffercache--1.2.sql
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 18f7a874524..d76ac243d3e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -4,7 +4,9 @@ MODULE_big = pg_buffercache
OBJS = pg_buffercache_pages.o $(WIN32RES)
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
+DATA = \
+ pg_buffercache--1.4.sql \
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
pg_buffercache--unpackaged--1.0.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
deleted file mode 100644
index 6ee5d8435bd..00000000000
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ /dev/null
@@ -1,21 +0,0 @@
-/* contrib/pg_buffercache/pg_buffercache--1.2.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit
-
--- Register the function.
-CREATE FUNCTION pg_buffercache_pages()
-RETURNS SETOF RECORD
-AS 'MODULE_PATHNAME', 'pg_buffercache_pages'
-LANGUAGE C PARALLEL SAFE;
-
--- Create a view for convenient access.
-CREATE VIEW pg_buffercache AS
- SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
- relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
- pinning_backends int4);
-
--- Don't want these to be available to public.
-REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
-REVOKE ALL ON pg_buffercache FROM PUBLIC;
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 00000000000..ab6d20a5ccf
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,36 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+DROP VIEW pg_buffercache;
+
+CREATE VIEW pg_buffercache AS
+ SELECT bufferid,
+ smgrid,
+ relfilenode,
+ reltablespace,
+ reldatabase,
+ relforknumber,
+ relblocknumber,
+ isdirty,
+ usagecount,
+ pinning_backends
+ FROM pg_buffercache_pages() AS P(
+ bufferid integer,
+ relfilenode oid,
+ reltablespace oid,
+ reldatabase oid,
+ relforknumber int2,
+ relblocknumber int8,
+ isdirty bool,
+ usagecount int2,
+ pinning_backends int4,
+ smgrid int2);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.4.sql
new file mode 100644
index 00000000000..9ae167abf0e
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4.sql
@@ -0,0 +1,41 @@
+/* contrib/pg_buffercache/pg_buffercache--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit
+
+-- Register the function.
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages'
+LANGUAGE C PARALLEL SAFE;
+
+-- Create a view for convenient access.
+CREATE VIEW pg_buffercache AS
+ SELECT bufferid,
+ smgrid,
+ relfilenode,
+ reltablespace,
+ reldatabase,
+ relforknumber,
+ relblocknumber,
+ isdirty,
+ usagecount,
+ pinning_backends
+ FROM pg_buffercache_pages() AS P(
+ bufferid integer,
+ relfilenode oid,
+ reltablespace oid,
+ reldatabase oid,
+ relforknumber int2,
+ relblocknumber int8,
+ isdirty bool,
+ usagecount int2,
+ pinning_backends int4,
+ smgrid int2);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae9abf..a82ae5f9bb5 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579fcbb0..2754c1e40e9 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -16,7 +16,7 @@
#define NUM_BUFFERCACHE_PAGES_MIN_ELEM 8
-#define NUM_BUFFERCACHE_PAGES_ELEM 9
+#define NUM_BUFFERCACHE_PAGES_ELEM 10
PG_MODULE_MAGIC;
@@ -25,6 +25,7 @@ PG_MODULE_MAGIC;
*/
typedef struct
{
+ SmgrId smgrid;
uint32 bufferid;
Oid relfilenode;
Oid reltablespace;
@@ -116,10 +117,12 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
BOOLOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 8, "usage_count",
INT2OID, -1, 0);
-
- if (expected_tupledesc->natts == NUM_BUFFERCACHE_PAGES_ELEM)
+ if (expected_tupledesc->natts >= 9)
TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
INT4OID, -1, 0);
+ if (expected_tupledesc->natts >= 10)
+ TupleDescInitEntry(tupledesc, (AttrNumber) 10, "smgrid",
+ INT2OID, -1, 0);
fctx->tupdesc = BlessTupleDesc(tupledesc);
@@ -153,6 +156,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
+ fctx->record[i].smgrid = bufHdr->tag.smgrid;
fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
@@ -206,6 +210,8 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
nulls[7] = true;
/* unused for v1.0 callers, but the array is always long enough */
nulls[8] = true;
+ /* unused for < v1.4 callers, but the array is always long enough */
+ nulls[9] = true;
}
else
{
@@ -226,6 +232,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
/* unused for v1.0 callers, but the array is always long enough */
values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
nulls[8] = false;
+ /* unused for < v1.4 callers, but the array is always long enough */
+ values[9] = Int16GetDatum(fctx->record[i].smgrid);
+ nulls[9] = false;
}
/* Build and return the tuple. */
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index faf5a3115dc..a0a7be32b4b 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -57,6 +57,13 @@
<entry>ID, in the range 1..<varname>shared_buffers</varname></entry>
</row>
+ <row>
+ <entry><structfield>smgrid</structfield></entry>
+ <entry><type>smallint</type></entry>
+ <entry></entry>
+ <entry>Block storage manager ID. 0 for regular relation data.</entry>
+ </row>
+
<row>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
--
2.21.0
0003-Move-tablespace-dir-creation-from-smgr.c-to-md.c-v3.patch (application/octet-stream)
From d6cbee2e4b81dc22b33115db0ba8e7feb5be5c12 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 30 Apr 2019 22:11:03 +1200
Subject: [PATCH 3/3] Move tablespace dir creation from smgr.c to md.c.
Potential future SMGR implementations may not need to create
tablespace directories when opening a relation. Make that md.c
specific.
Author: Thomas Munro
---
src/backend/storage/smgr/md.c | 14 ++++++++++++++
src/backend/storage/smgr/smgr.c | 14 --------------
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index b2c42cf8f0a..a426e2d36bd 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -196,6 +197,19 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
Assert(reln->md_num_open_segs[forkNum] == 0);
+ /*
+ * We may be using the target table space for the first time in this
+ * database, so create a per-database subdirectory if needed.
+ *
+ * XXX this is a fairly ugly violation of module layering, but this seems
+ * to be the best place to put the check. Maybe TablespaceCreateDbspace
+ * should be here and not in commands/tablespace.c? But that would imply
+ * importing a lot of stuff that smgr.c oughtn't know, either.
+ */
+ TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
+ reln->smgr_rnode.node.dbNode,
+ isRedo);
+
path = relpath(reln->smgr_rnode, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 26281fab51d..4ba07a08f54 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -17,7 +17,6 @@
*/
#include "postgres.h"
-#include "commands/tablespace.h"
#include "lib/ilist.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
@@ -343,19 +342,6 @@ smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
if (isRedo && reln->md_num_open_segs[forknum] > 0)
return;
- /*
- * We may be using the target table space for the first time in this
- * database, so create a per-database subdirectory if needed.
- *
- * XXX this is a fairly ugly violation of module layering, but this seems
- * to be the best place to put the check. Maybe TablespaceCreateDbspace
- * should be here and not in commands/tablespace.c? But that would imply
- * importing a lot of stuff that smgr.c oughtn't know, either.
- */
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- isRedo);
-
smgrsw[reln->smgr_which].smgr_create(reln, forknum, isRedo);
}
--
2.21.0
On Mon, Jul 15, 2019 at 6:59 AM Thomas Munro <thomas.munro@gmail.com> wrote:
That's an enum, so it works out to a word per record. The obvious way
to avoid increasing the size is to shove the SMGR ID into the same space
that holds the forknum. Unlike BufferTag, where forknum currently
swims in 32 bits which this patch chops in half, XLogRecordBlockHeader
already crams forknum into a uint8 fork_flags, of which it has only the
lower nibble; the upper nibble is used for e.g. BKPBLOCK_xxx flag
bits, and there isn't even a spare bit to say 'has non-zero SMGR ID'.
Rats. I suppose I could change it to a byte. I wonder if one extra
byte per WAL record is acceptable. Anyone?
OK, I'll bite: I don't like it. I think this patch is more about how
people feel about things than it is about a technically necessary
change, and I'm absolutely OK with that up to the point where it
starts to inflict measurable costs on our users. Making WAL records
bigger in common cases, even by 1 byte, is a measurable cost. And
there are a few other minor costs too: we whack around a bunch of
internal APIs, and we force a pg_buffercache version bump. And I am
of the opinion that none of those costs, big or small, are buying us
anything technically. I am OK with being convinced otherwise, but
right now I am not convinced.
To set forth my argument: I think magic database OIDs are just fine.
The contrary arguments as I understand them are (1) stuff might break
if there's no matching entry in pg_database, or if there is, and (2)
some hypothetical smgr might need the database OID as a discriminator.
My counter-arguments are (1) we can fix that by writing the
appropriate code and it doesn't even seem very hard and (2) tough
noogies. To expand on (2) slightly, the proposals on the table do not
need that, the existing smgr does not need that, and there's no reason
to suppose that future proposals would require that either, because
2^32 relfilenodes of up to 2^32 blocks each is a lot, and you
shouldn't need another 32 bits. If someone does come up with a
proposal that needs those bits, perhaps because it lives within a
database rather than being a global object like SLRU or undo data,
maybe it should be a new kind of AM rather than a new smgr. And if
not, then maybe we should leave it to that hypothetical patch to solve
that hypothetical problem, because right now we're just speculating
that another 32 bits will fix it, which we can't really know, because
if we're hypothesizing the existence of a patch that needs more bits,
we could also hypothesize that it needs more than 32 of them.
If we absolutely have to keep driving down this course, you could
probably steal a bit from the fork number nibble to indicate a
non-default smgr. Even if there are only 2 bits there, you could use
1 for non-default smgr and 1 for non-default fork number, and then in
the common case of references to the default fork of the default
smgr, you wouldn't be spending anything additional ... assuming you
don't count the CPU cycles to encode and decode a more complex WAL
record format.
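
The bit-stealing idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the SK_HAS_SMGR and SK_HAS_FORK bits, the sk_* function names, and the choice to append one extra byte each for a non-default smgr ID and a non-default fork number are all hypothetical, not part of any posted patch. The only thing taken from the existing WAL format is that the upper nibble of fork_flags carries the BKPBLOCK_* flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bits stolen from the low (fork number) nibble of fork_flags. */
#define SK_HAS_SMGR 0x01        /* a non-default smgr ID byte follows */
#define SK_HAS_FORK 0x02        /* a non-default fork number byte follows */

/*
 * Encode the start of a block reference.  The upper nibble of fork_flags is
 * passed through untouched.  Returns the number of bytes written (1 to 3);
 * the common case (default smgr, default fork) costs exactly one byte.
 */
static size_t
sk_encode_block_ref(uint8_t upper_flags, uint8_t smgrid, uint8_t forknum,
                    uint8_t *out)
{
    size_t      n = 0;
    uint8_t     fork_flags = upper_flags & 0xF0;

    assert(out != NULL);

    if (smgrid != 0)
        fork_flags |= SK_HAS_SMGR;
    if (forknum != 0)
        fork_flags |= SK_HAS_FORK;

    out[n++] = fork_flags;
    if (smgrid != 0)
        out[n++] = smgrid;
    if (forknum != 0)
        out[n++] = forknum;
    return n;
}

/* Decode what sk_encode_block_ref() wrote; returns the bytes consumed. */
static size_t
sk_decode_block_ref(const uint8_t *in, uint8_t *upper_flags,
                    uint8_t *smgrid, uint8_t *forknum)
{
    size_t      n = 0;
    uint8_t     fork_flags = in[n++];

    *upper_flags = fork_flags & 0xF0;
    *smgrid = (fork_flags & SK_HAS_SMGR) ? in[n++] : 0;
    *forknum = (fork_flags & SK_HAS_FORK) ? in[n++] : 0;
    return n;
}
```

Under these assumptions a reference to the main fork of an md.c relation stays at one byte, while the extra branches in the encoder and decoder are the CPU cost mentioned above.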
But how about just using a magic database OID?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jul 16, 2019 at 1:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
[long form -1]
But how about just using a magic database OID?
This patch was just an experiment based on discussion here:
/messages/by-id/CA+hUKG+DE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo+83vvw@mail.gmail.com
I learned some things. The main one is that you don't just need space
in the buffer tag (which has plenty of spare bits) but also in WAL block
references, and that does seem to be a strike against the idea. I
don't want lack of agreement here to hold up other work. So here's
what I propose:
I'll go and commit the simple refactoring bits of this work, which
just move some stuff belonging to md.c out of smgr.c (see attached).
I'll go back to using a magic database OID for the undo log patch set
for now. We could always reconsider the SMGR discriminator later.
For now I'm not going to consider this question a blocker for the
later undo code when it's eventually ready for commit.
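
The remark about spare bits in the buffer tag can be illustrated with a minimal sketch. The struct names OldBufferTag and NewBufferTag, the field order, and the int16 widths are assumptions made for illustration; the types are simplified stand-ins, not the real PostgreSQL definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types. */
typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct RelFileNode
{
    Oid         spcNode;
    Oid         dbNode;
    Oid         relNode;
} RelFileNode;

/* Layout without a discriminator: forkNum occupies a full 32-bit slot. */
typedef struct OldBufferTag
{
    RelFileNode rnode;
    int32_t     forkNum;
    BlockNumber blockNum;
} OldBufferTag;

/* Hypothetical layout with the 32-bit slot split between smgrid and forkNum. */
typedef struct NewBufferTag
{
    RelFileNode rnode;
    int16_t     smgrid;
    int16_t     forkNum;
    BlockNumber blockNum;
} NewBufferTag;

/* The tag must not grow, or every buffer header and hash entry grows too. */
_Static_assert(sizeof(NewBufferTag) == sizeof(OldBufferTag),
               "buffer tag size changed");

/* Two tags match only if the smgr ID matches as well as the rest. */
static int
buffer_tags_equal(const NewBufferTag *a, const NewBufferTag *b)
{
    return a->smgrid == b->smgrid &&
        a->rnode.spcNode == b->rnode.spcNode &&
        a->rnode.dbNode == b->rnode.dbNode &&
        a->rnode.relNode == b->rnode.relNode &&
        a->blockNum == b->blockNum &&
        a->forkNum == b->forkNum;
}
```

A WAL block reference has no comparable slack, since its fork number already shares a single byte with flag bits, which is why the same trick costs space there.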
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Move-some-md.c-specific-logic-from-smgr.c-to-md.c.patch (application/octet-stream)
From a51ee45d0397f1009140e95c985b63e9c742b9ac Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 30 Apr 2019 22:11:03 +1200
Subject: [PATCH] Move some md.c-specific logic from smgr.c to md.c.
Potential future SMGR implementations may not want to create
tablespace directories when creating a SMGR relation. Move that
logic to mdcreate(). Move the initialization of md-specific
data structures from smgropen() to a new callback mdopen().
Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com
---
src/backend/storage/smgr/md.c | 37 +++++++++++++++++++++++++++------
src/backend/storage/smgr/smgr.c | 33 ++++-------------------------
src/include/storage/md.h | 1 +
3 files changed, 36 insertions(+), 35 deletions(-)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 58c94e9257a..52136ad5580 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -120,7 +121,7 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
bool isRedo);
-static MdfdVec *mdopen(SMgrRelation reln, ForkNumber forknum, int behavior);
+static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
@@ -165,7 +166,7 @@ mdexists(SMgrRelation reln, ForkNumber forkNum)
*/
mdclose(reln, forkNum);
- return (mdopen(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
+ return (mdopenfork(reln, forkNum, EXTENSION_RETURN_NULL) != NULL);
}
/*
@@ -185,6 +186,19 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
Assert(reln->md_num_open_segs[forkNum] == 0);
+ /*
+ * We may be using the target table space for the first time in this
+ * database, so create a per-database subdirectory if needed.
+ *
+ * XXX this is a fairly ugly violation of module layering, but this seems
+ * to be the best place to put the check. Maybe TablespaceCreateDbspace
+ * should be here and not in commands/tablespace.c? But that would imply
+ * importing a lot of stuff that smgr.c oughtn't know, either.
+ */
+ TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
+ reln->smgr_rnode.node.dbNode,
+ isRedo);
+
path = relpath(reln->smgr_rnode, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -425,7 +439,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
}
/*
- * mdopen() -- Open the specified relation.
+ * mdopenfork() -- Open one fork of the specified relation.
*
* Note we only open the first segment, when there are multiple segments.
*
@@ -435,7 +449,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* invent one out of whole cloth.
*/
static MdfdVec *
-mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
+mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
{
MdfdVec *mdfd;
char *path;
@@ -474,6 +488,17 @@ mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
return mdfd;
}
+/*
+ * mdopen() -- Initialize newly-opened relation.
+ */
+void
+mdopen(SMgrRelation reln)
+{
+ /* mark it not open */
+ for (int forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+ reln->md_num_open_segs[forknum] = 0;
+}
+
/*
* mdclose() -- Close the specified relation, if it isn't closed already.
*/
@@ -713,7 +738,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL);
+ MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
@@ -1137,7 +1162,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
v = &reln->md_seg_fds[forknum][reln->md_num_open_segs[forknum] - 1];
else
{
- v = mdopen(reln, forknum, behavior);
+ v = mdopenfork(reln, forknum, behavior);
if (!v)
return NULL; /* if behavior & EXTENSION_RETURN_NULL */
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index dba8c397feb..b0d9f21e688 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -17,7 +17,6 @@
*/
#include "postgres.h"
-#include "commands/tablespace.h"
#include "lib/ilist.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
@@ -41,6 +40,7 @@ typedef struct f_smgr
{
void (*smgr_init) (void); /* may be NULL */
void (*smgr_shutdown) (void); /* may be NULL */
+ void (*smgr_open) (SMgrRelation reln);
void (*smgr_close) (SMgrRelation reln, ForkNumber forknum);
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
@@ -68,6 +68,7 @@ static const f_smgr smgrsw[] = {
{
.smgr_init = mdinit,
.smgr_shutdown = NULL,
+ .smgr_open = mdopen,
.smgr_close = mdclose,
.smgr_create = mdcreate,
.smgr_exists = mdexists,
@@ -170,8 +171,6 @@ smgropen(RelFileNode rnode, BackendId backend)
/* Initialize it if not present before */
if (!found)
{
- int forknum;
-
/* hash_search already filled in the lookup key */
reln->smgr_owner = NULL;
reln->smgr_targblock = InvalidBlockNumber;
@@ -179,9 +178,8 @@ smgropen(RelFileNode rnode, BackendId backend)
reln->smgr_vm_nblocks = InvalidBlockNumber;
reln->smgr_which = 0; /* we only have md.c at present */
- /* mark it not open */
- for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- reln->md_num_open_segs[forknum] = 0;
+ /* implementation-specific initialization */
+ smgrsw[reln->smgr_which].smgr_open(reln);
/* it has no owner yet */
dlist_push_tail(&unowned_relns, &reln->node);
@@ -330,33 +328,10 @@ smgrclosenode(RelFileNodeBackend rnode)
* Given an already-created (but presumably unused) SMgrRelation,
* cause the underlying disk file or other storage for the fork
* to be created.
- *
- * If isRedo is true, it is okay for the underlying file to exist
- * already because we are in a WAL replay sequence.
*/
void
smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
{
- /*
- * Exit quickly in WAL replay mode if we've already opened the file. If
- * it's open, it surely must exist.
- */
- if (isRedo && reln->md_num_open_segs[forknum] > 0)
- return;
-
- /*
- * We may be using the target table space for the first time in this
- * database, so create a per-database subdirectory if needed.
- *
- * XXX this is a fairly ugly violation of module layering, but this seems
- * to be the best place to put the check. Maybe TablespaceCreateDbspace
- * should be here and not in commands/tablespace.c? But that would imply
- * importing a lot of stuff that smgr.c oughtn't know, either.
- */
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- isRedo);
-
smgrsw[reln->smgr_which].smgr_create(reln, forknum, isRedo);
}
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index df24b931613..c0f05e23ff9 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -21,6 +21,7 @@
/* md storage manager functionality */
extern void mdinit(void);
+extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
--
2.21.0
On Tue, Jul 16, 2019 at 10:49 AM Thomas Munro <thomas.munro@gmail.com> wrote:
I'll go and commit the simple refactoring bits of this work, which
just move some stuff belonging to md.c out of smgr.c (see attached).
Pushed. The rest of that earlier patch set is hereby abandoned (at
least for now). I'll be posting a new-and-improved undo log patch set
soon, now a couple of patches smaller but back to magic database 9. I
think I'll probably do that with a new catalog header file that
defines pseudo-database OIDs.
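
As a rough idea of what such a header and its use might look like, here is a minimal sketch. Only the value 9 comes from this thread; the macro name SKETCH_UNDO_DB_OID, the SketchSmgrId enum, and the sketch_smgr_for_database() helper are hypothetical names invented for illustration, and the real patch set may look quite different.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid */

/* Hypothetical pseudo-database OID; the thread only says "magic database 9". */
#define SKETCH_UNDO_DB_OID ((Oid) 9)

/* Hypothetical storage manager identifiers; 0 stands for md.c. */
typedef enum SketchSmgrId
{
    SKETCH_SMGR_MD = 0,
    SKETCH_SMGR_UNDO = 1
} SketchSmgrId;

/*
 * Pick a storage manager from the database OID found in a buffer tag or
 * WAL block reference.  Anything that is not a reserved pseudo-database,
 * including 0 for shared relations, goes to md.c.
 */
static SketchSmgrId
sketch_smgr_for_database(Oid dbNode)
{
    switch (dbNode)
    {
        case SKETCH_UNDO_DB_OID:
            return SKETCH_SMGR_UNDO;
        default:
            return SKETCH_SMGR_MD;
    }
}
```

With a mapping like this, neither the buffer tag nor the WAL block reference needs a new field, because the discriminator rides in the existing database OID.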
--
Thomas Munro
https://enterprisedb.com
On Tue, Jul 16, 2019 at 10:49:39AM +1200, Thomas Munro wrote:
On Tue, Jul 16, 2019 at 1:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
[long form -1]
But how about just using a magic database OID?
This patch was just an experiment based on discussion here:
/messages/by-id/CA+hUKG+DE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo+83vvw@mail.gmail.com
I learned some things. The main one is that you don't just need space
in the buffer tag (which has plenty of spare bits) but also in WAL block
references, and that does seem to be a strike against the idea. I
don't want lack of agreement here to hold up other work. So here's
what I propose:
I'll go and commit the simple refactoring bits of this work, which
just move some stuff belonging to md.c out of smgr.c (see attached).
I'll go back to using a magic database OID for the undo log patch set
for now. We could always reconsider the SMGR discriminator later.
For now I'm not going to consider this question a blocker for the
later undo code when it's eventually ready for commit.
Agree that we should move on at this point. The magic OIDs do not block
us from moving to this model later if needed.
--
Shawn Debnath
Amazon Web Services (AWS)
On Wed, Jul 17, 2019 at 03:01:47PM +1200, Thomas Munro wrote:
On Tue, Jul 16, 2019 at 10:49 AM Thomas Munro <thomas.munro@gmail.com> wrote:
I'll go and commit the simple refactoring bits of this work, which
just move some stuff belonging to md.c out of smgr.c (see attached).
Pushed. The rest of that earlier patch set is hereby abandoned (at
least for now). I'll be posting a new-and-improved undo log patch set
soon, now a couple of patches smaller but back to magic database 9. I
think I'll probably do that with a new catalog header file that
defines pseudo-database OIDs.
One suggestion, let's expose the magic oids via a dedicated catalog
pg_smgr so that they can be reserved and accounted for via the scripts
as discussed in [1]. There were suggestions in the thread to use pg_am,
but with the revised pg_am [2], it seems we will be stretching the
meaning of access methods quite a bit, in my opinion, incorrectly.
The benefit of having a dedicated catalog is that we can expose data
particular to smgrs that do not fit in the access methods scope.
[1]: /messages/by-id/20180821184835.GA1032@60f81dc409fc.ant.amazon.com
[2]: https://www.postgresql.org/docs/devel/catalog-pg-am.html
--
Shawn Debnath
Amazon Web Services (AWS)